GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: August 11, 2004, 13:55:45 ; Search time 47.4522 Seconds
        (without alignments)
        1774.398 Million cell updates/sec

Title: US-09-864-675-4
Perfect score: 1574
Sequence: 1 MRRDPAPGFSMLLFGVSLAC......KCPVGYTGDRCQQFAMVNFS 298

Scoring table: BLOSUM62
               Gapop 10.0 , Gapext 0.5

Searched: 1586107 seqs, 282547505 residues

Total number of hits satisfying chosen parameters: 1586107

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
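
The throughput figure in the run header can be checked from the other header values. The sketch below assumes that one "cell update" means filling one cell of the query-by-database dynamic-programming matrix; the function name is hypothetical and is not part of GenCore.

```python
# Sketch: cell-update rate of a dynamic-programming database search.
# Assumption: one "cell update" is one (query residue x database residue)
# matrix cell. All figures are copied from the run header.

def cell_updates_per_second(query_length: int, db_residues: int, seconds: float) -> float:
    """Return the number of matrix cells filled per second."""
    return query_length * db_residues / seconds

QUERY_LENGTH = 298          # residues in query US-09-864-675-4
DB_RESIDUES = 282_547_505   # residues searched
SEARCH_TIME = 47.4522       # seconds, without alignments

rate_millions = cell_updates_per_second(QUERY_LENGTH, DB_RESIDUES, SEARCH_TIME) / 1e6
print(f"{rate_millions:.3f} million cell updates/sec")
```

Under that assumption the result agrees with the reported 1774.398 million cell updates/sec to within the rounding of the search time.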

Database : A_Geneseq_29Jan04:*
           1: geneseqp1980s:*
           2: geneseqp1990s:*
           3: geneseqp2000s:*
           4: geneseqp2001s:*
           5: geneseqp2002s:*
           6: geneseqp2003as:*
           7: geneseqp2003bs:*
           8: geneseqp2004s:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

Result                     Query
   No.        Score        Match       Length           DB  ID           Description
     1         1574        100.0          298            5  AAU11636     Aau11636 Human Neu
     2         1547         98.3          754            2  AAW27536     Aaw27536 Rat cereb
     3         1505         95.6          330            5  AAU11635     Aau11635 Human Neu
     4         1505         95.6          422            5  ABB07894     Abb07894 Human neu
     5         1505         95.6          426            5  ABB07893     Abb07893 Human neu
     6         1478         93.9          330            2  AAW27537     Aaw27537 Rat cereb
     7         1470         93.4          860            2  AAW63700     Aaw63700 Receptor
     8          776         49.3          469            6  ABG71639     Abg71639 Human sec
     9          776         49.3          647            2  AAW48383     Aaw48383 Homo sapi
    10          776         49.3          647            5  ABG71644     Abg71644 [illegible]
    11          770         48.9          469            2  AAW48382     Aaw48382 Homo sapi
    12          736         46.8          407            2  AAW48381     Aaw48381 Homo sapi
    13          736         46.8          407  [illegible]  ABG71638     Abg71638 [illegible]
    14          716         45.5          181            2  AAW48380     Aaw48380 Mus muscu
    15          716         45.5  [illegible]  [illegible]  ABG71637     Abg71637 [illegible]
    16          716         45.5          605  [illegible]  ABG71636     Abg71636 [illegible]
    17          711         45.2          605  [illegible]  AAW48379     Aaw48379 [illegible]
    18          545  [illegible]          422            4  AAG67901     Aag67901 [illegible]
    19          545         34.5  [illegible]            4  AAG67939     [illegible]
    20          544  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    21          544  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    22          544  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    23          544  [illegible]          422  [illegible]  AAW09371     Aaw09371 [illegible]
    24          544  [illegible]          422  [illegible]  [illegible]  [illegible]
    25          544  [illegible]          422  [illegible]  [illegible]  [illegible]
    26          543  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    27          543  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    28  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    29  [illegible]  [illegible]          419  [illegible]  [illegible]  [illegible]
    30  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    31          506  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    32  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    33  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    34  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    35          476         30.2          342            4  AAB67754     Aab67754 [illegible]
    36          473  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    37          473         30.1          323            4  [illegible]  [illegible]
    38          471  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
    39          375  [illegible]  [illegible]            4  [illegible]  [illegible]
    40  [illegible]  [illegible]  [illegible]            4  [illegible]  [illegible]
    41          375         23.8          204            5  ABJ00012     Abj00012 Human neu
    42          375         23.8          204            5  ABJ00050     Abj00050 Human neu
    43        362.5         23.0          257            2  AAR28538     Aar28538 GGF2BPP3.
    44        362.5         23.0          257            2  AAR55690     Aar55690 GGF2BPP3.
    45        362.5         23.0          257            2  AAR46897     Aar46897 GGF2BPP3.

ALIGNMENTS 



RESULT 1 
AAU11636 

ID AAU11636 standard; protein; 298 AA. 
XX 

AC AAU11636; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human Neuregulin-2beta, NRG-2beta. 
XX 

KW Human; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis; cell survival;

KW cell growth; cell differentiation; erbB receptor; cardiomyopathy; 

KW ischaemic damage; cardiac trauma; heart failure; atherosclerosis; 

KW vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 



KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia; 

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens. 
XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001. 
XX 

PF 23-MAY-2001; 2001WO-US016896.
XX
PR 23-MAY-2000; 2000US-0206495P.
XX
PA (CENE-) CENES PHARM INC.
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR N-PSDB; AAS18020. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating multiple 

PT sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's disease, by 

PT increasing mitogenesis, survival, growth or differentiation of a cell.
XX 

PS Claim 53; Fig 9; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG)-2

CC polypeptide comprising or consisting of a sequence for human NRG-2alpha 

CC or NRG-2beta (clone 2b7) and the polynucleotides encoding the. Also 

CC included are a vector expressing the protein, a host cell comprising the 

CC vector, a transgenic non-human animal transformed with the vector or 

CC having a knockout mutation in one or both NRG-2 alleles and an anti-NRG-2 

CC antibody. Analysis of mutations in NRG-2 in an individual is useful for 

CC diagnosing an increased likelihood of developing a NRG-2-related disease 

CC or condition in a test subject. NRG-2 is useful for increasing the 

CC mitogenesis, survival, growth or differentiation of a cell (e.g. a 

CC neuronal cell), where the cell expresses an erbB receptor. NRG-2 is 

CC useful for treating diseases and disorders such as cardiomyopathy 

CC (preferably degenerative congenital disease), ischaemic damage, cardiac

CC trauma or heart failure or which has a condition affecting smooth muscle 

CC which include atherosclerosis, vascular lesion, vascular hypertension, 

CC and degenerative congenital vascular disease, myasthenia gravis, a 

CC neurodegenerative disorder, peripheral neuropathy, a sensory nerve fiber 

CC neuropathy, a motor fiber and a sensory nerve fiber neuropathy, multiple 

CC sclerosis, amyotrophic lateral sclerosis, spinal muscular atrophy, nerve 

CC injury, Alzheimer's disease, Parkinson's disease, cerebellar ataxia, and 

CC spinal cord injury. The antibody is useful for treatment of a tumour 

CC comprising inhibiting proliferation of a tumour cell preferably a glial 

CC tumour cell, for treating of neurofibromatosis by inhibiting glial cell 

CC mitogenesis. The present sequence represents NRG-2beta 

XX 

SQ Sequence 298 AA; 



Query Match 100.0%; Score 1574; DB 5; Length 298;
Best Local Similarity 100.0%; Pred. No. 2.7e-105;
Matches 298; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
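
The Query Match percentages in this report are consistent with the hit score divided by the perfect score (1574, the query's score against itself). The sketch below checks that relationship for the first seven results; the function name is hypothetical, and the formula is inferred from the reported figures rather than taken from GenCore documentation.

```python
# Sketch: Query Match as a percentage of the perfect (self) score.
# Assumption: Query Match = 100 * score / perfect score, rounded to 0.1.

PERFECT_SCORE = 1574  # from the run header

def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Return the hit score as a percentage of the perfect score."""
    return round(100.0 * score / perfect_score, 1)

# Scores of results 1, 2, 3-5, 6 and 7 in the summary table.
for score in (1574, 1547, 1505, 1478, 1470):
    print(score, query_match_percent(score))
```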



Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298



RESULT 2 
AAW27536 

ID AAW27536 standard; protein; 754 AA.
XX 

AC AAW27536; 
XX 

DT 18-DEC-1997 (first entry) 
XX 

DE Rat cerebellum derived growth factor 1. 
XX 

KW Rat; cerebellum derived growth factor; CDGF1; screening; binding;

KW modulation; erbB type receptor; identification; indication; risk; 

KW proliferation; differentiation; induction; neuron; hyperplasia; 

KW stem cell culture; intracerebral graft; alleviation; repair; 

KW behavioural defect; nervous system; central; peripheral; nerve; 

KW prothesis; damage; entubulation; cell survival; treatment; injury; 

KW trauma; ischaemia; ischemia; stroke; infection; disorder; inflammation; 

KW neurodegeneration; disease; Parkinson's; Huntingdon's; 

KW amylotrophic lateral sclerosis; sensory; retina; 

KW spinocerebellar degeneration; multiple sclerosis; neoplasia; 

KW amalignant glioma; medulloblastoma; neuroectodermal tumour. 

XX 

OS Rattus rattus. 



XX 

FH Key Location/Qualifiers
FT Peptide 1..23
FT /label= sig_peptide
FT Peptide 24..754
FT /label= mat_peptide
FT Region 55
FT /note= "potential N-glycosylation site"
FT Domain 158..228
FT /label= immmunoglobulin_like_domain
FT Region 186
FT /note= "potential N-glycosylation site"
FT Domain 252..297
FT /label= epidermal_growth_factor_like_domain
FT Region 253
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"
FT Region 254
FT /note= "potential N-glycosylation site"
FT Region 261
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"
FT Region 267
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"
FT Region 278
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"
FT Region 280
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"
FT Region 289
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"
FT Region 296
FT /note= "potential N-glycosylation site"
FT Cleavage-site 314..315
FT /label= potential_proteolytic_site
FT Domain 316..338
FT /label= putative_transmembrane_domain
XX 

PN WO9709425-A1. 
XX 

PD 13-MAR-1997. 
XX 

PF 09-SEP-1996; 96WO-US014484.
XX
PR 08-SEP-1995; 95US-00525864.
XX 

PA (HARD ) HARVARD COLLEGE. 

PA (STRD ) UNIV LELAND S STANFORD. 

XX 

PI Chang H; 
XX 

DR WPI; 1997-192900/17. 

DR N-PSDB; AAT87922. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the treatment 

PT of neuronal injury and proliferative disorders. 

XX 

PS Claim 1; Page 63-66; 94pp; English. 
XX 

CC The present sequence is rat cerebellum derived growth factor 1 (CDGF1),

CC which can be used to screen for modulators of CDGF binding to erbB type 

CC receptors. Identification of a modification or mutation in a CDGF gene,

CC or aberrant expression of a CDGF gene or levels of soluble CDGF may be 

CC used to indicate the risk of unwanted cell proliferation or 



CC differentiation. CDGF may be used to induce neuronal differentiation in 

CC stem cell culture, and maintain the integrity of a terminally 

CC differentiated neuronal cell culture, e.g. useful for intracerebral 

CC grafting to alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially where 

CC a crushed or severed axon is entubulated by a prosthetic. CDGF may also 

CC be used to enhance neuronal cell survival in the central or peripheral 

CC nervous system, to treat neurological conditions associated with nervous 

CC system injury, e.g. traumatic, chemical or vasal injury and deficits such 

CC as ischaemia resulting from stroke, infectious/inflammatory and tumour 

CC induced injury, chronic neurodegenerative disease including Parkinson's

CC and Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the central 

CC nervous system, e.g. amalignant gliomas, medulloblastomas and 

CC neuroectodermal tumours 

XX 

SQ Sequence 754 AA; 



Query Match 98.3%; Score 1547; DB 2; Length 754;
Best Local Similarity 97.7%; Pred. No. 6.9e-103;
Matches 291; Conservative 4; Mismatches 3; Indels 0; Gaps 0;



Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db        121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298



RESULT 3 
AAU11635 

ID AAU11635 standard; protein; 330 AA. 
XX 

AC AAU11635; 
XX 
DT 12-MAR-2002 (first entry)
XX
DE Human Neuregulin-2alpha, NRG-2alpha.
XX
KW Human; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis; cell survival;



KW cell growth; cell differentiation; erbB receptor; cardiomyopathy; 

KW ischaemic damage; cardiac trauma; heart failure; atherosclerosis; 

KW vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 

KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia;

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens.
XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001. 
XX 

PF 23-MAY-2001; 2001WO-US016896.
XX
PR 23-MAY-2000; 2000US-0206495P.
XX
PA (CENE-) CENES PHARM INC.
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR N-PSDB; AAS18019. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating multiple 

PT sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's disease, by 

PT increasing mitogenesis, survival, growth or differentiation of a cell. 
XX 

PS Claim 53; Fig 7; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG)-2

CC polypeptide comprising or consisting of a sequence for human NRG-2alpha 

CC or NRG-2beta (clone 2b7) and the polynucleotides encoding the. Also 

CC included are a vector expressing the protein, a host cell comprising the 

CC vector, a transgenic non-human animal transformed with the vector or 

CC having a knockout mutation in one or both NRG-2 alleles and an anti-NRG-2 

CC antibody. Analysis of mutations in NRG-2 in an individual is useful for 

CC diagnosing an increased likelihood of developing a NRG-2-related disease 

CC or condition in a test subject. NRG-2 is useful for increasing the 

CC mitogenesis, survival, growth or differentiation of a cell (e.g. a 

CC neuronal cell), where the cell expresses an erbB receptor. NRG-2 is 

CC useful for treating diseases and disorders such as cardiomyopathy 

CC (preferably degenerative congenital disease), ischaemic damage, cardiac

CC trauma or heart failure or which has a condition affecting smooth muscle 

CC which include atherosclerosis, vascular lesion, vascular hypertension, 

CC and degenerative congenital vascular disease, myasthenia gravis, a 

CC neurodegenerative disorder, peripheral neuropathy, a sensory nerve fiber 

CC neuropathy, a motor fiber and a sensory nerve fiber neuropathy, multiple 

CC sclerosis, amyotrophic lateral sclerosis, spinal muscular atrophy, nerve 

CC injury, Alzheimer's disease, Parkinson's disease, cerebellar ataxia, and 

CC spinal cord injury. The antibody is useful for treatment of a tumour 

CC comprising inhibiting proliferation of a tumour cell preferably a glial 

CC tumour cell, for treating of neurofibromatosis by inhibiting glial cell 



CC mitogenesis. The present sequence represents NRG-2alpha
XX 

SQ Sequence 330 AA; 

Query Match 95.6%; Score 1505; DB 5; Length 330; 

Best Local Similarity 98.6%; Pred. No. 2.8e-100;

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 
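
The Best Local Similarity values reported here agree with the number of identical positions divided by the number of aligned columns. The sketch below parses a counts line and recomputes the percentage. The helper names are hypothetical, and the choice of denominator (matches + conservative + mismatches + indels) is an assumption inferred from the figures in this report, which all have zero indels.

```python
import re

def parse_counts(line: str) -> dict:
    """Parse 'Name value;' pairs from an alignment counts line."""
    return {name: int(value) for name, value in re.findall(r"(\w+)\s+(\d+);", line)}

def best_local_similarity(counts: dict) -> float:
    """Identical positions as a percentage of aligned columns (assumed definition)."""
    columns = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return round(100.0 * counts["Matches"] / columns, 1)

# Counts line of RESULT 3 (285 identities over 289 aligned columns).
counts = parse_counts("Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0;")
print(counts)
print(best_local_similarity(counts))
```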

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289



RESULT 4
ABB07894
ID ABB07894 standard; protein; 422 AA.
XX
AC ABB07894;
XX
DT 03-JUL-2002 (first entry)
XX
DE Human neuregulin 2 isoform 6.
XX
KW Human; MUC1; mucin; glycoprotein; cytostatic; cancer; tumour; ECD;
KW extracellular domain; neuregulin 2; isoform.
XX
OS Homo sapiens.
XX
PN WO200222685-A2.
XX
PD 21-MAR-2002.
XX
PF 11-SEP-2001; 2001WO-US028548.
XX
PR 11-SEP-2000; 2000US-0231841P.
XX
PA (KUFE/) KUFE D W.
PA (OHNO/) OHNO T.
XX
PI Kufe DW, Ohno T;
XX

DR WPI; 2002-339864/37. 
XX 

PT Use of a mucin glycoprotein (MUC1) extracellular domain antagonist for
PT manufacturing a medicant that inhibits the proliferation of MUC-1
PT expressing cancer cells and that can treat cancers and reduce tumor
PT growth.
XX
PS Claim 6; Page 56-58; 74pp; English.
XX
CC The invention relates to the use of a MUC1 (mucin glycoprotein)
CC extracellular domain (ECD) antagonist for the manufacture of a medicant
CC to inhibit the proliferation of MUC-1 expressing cancer cells. MUC1 ECD
CC antagonists (optionally combined with a pharmaceutical carrier) can be
CC administered to inhibit proliferation of MUC1-expressing cancer cells,
CC useful to treat cancers e.g. skin cancer, prostate cancer and leukemia,
CC especially in humans. The method may also be combined with administration
CC of a chemotherapeutic agent (e.g. an alkylating agent, topisomerase etc)
CC or radiation to treat cancer, especially to reduce tumour growth. The
CC polypeptides are also useful in screening to identify MUC1 ECD
CC antagonists. The present sequence represents a human neuregulin 2 isoform
CC 6, a fragment of which can bind to MUC1/ECD

XX 

SQ Sequence 422 AA; 

Query Match 95.6%; Score 1505; DB 5; Length 422; 

Best Local Similarity 98.6%; Pred. No. 3.7e-100; 

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db        333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 381
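
The marker line between each Qy and Db row flags identical residues. The sketch below rebuilds an identity-only marker line for the last alignment block of this result (Qy 241-289 against Db 333-381) and counts the identities. It is a simplification: the function name is hypothetical, and it does not reproduce the ":" marker for conservative substitutions, which would require the BLOSUM62 table.

```python
# Sketch: identity-only marker line for one block of a gapless alignment.
# Assumption: both rows are already aligned and of equal length.

def match_line(query: str, subject: str) -> str:
    """Return '|' where residues are identical and ' ' elsewhere."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join("|" if q == s else " " for q, s in zip(query, subject))

QY = "TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC"  # Qy 241-289
DB = "TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC"  # Db 333-381

markers = match_line(QY, DB)
identities = markers.count("|")
print(QY)
print(markers)
print(DB)
print(f"{identities} identities over {len(markers)} columns")
```

The four non-identical columns in this block account for the one conservative substitution and three mismatches reported for the whole alignment, since the first 240 columns are identical.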



RESULT 5 
ABB07893 

ID ABB07893 standard; protein; 426 AA. 
XX 

AC ABB07893; 
XX 

DT 03-JUL-2002 (first entry) 
XX 



DE Human neuregulin 2 isoform 5. 
XX 

KW Human; MUC1; mucin; glycoprotein; cytostatic; cancer; tumour; ECD;

KW extracellular domain; neuregulin 2; isoform. 

XX 

OS Homo sapiens. 
XX 

PN WO200222685-A2. 
XX 

PD 21-MAR-2002. 
XX 

PF 11-SEP-2001; 2001WO-US028548.
XX
PR 11-SEP-2000; 2000US-0231841P.
XX 

PA (KUFE/) KUFE D W. 

PA (OHNO/) OHNO T. 
XX 

PI Kufe DW, Ohno T; 
XX 

DR WPI; 2002-339864/37. 
XX 

PT Use of a mucin glycoprotein (MUC1) extracellular domain antagonist for
PT manufacturing a medicant that inhibits the proliferation of MUC-1
PT expressing cancer cells and that can treat cancers and reduce tumor
PT growth.
XX
PS Claim 6; Page 53-55; 74pp; English.
XX
CC The invention relates to the use of a MUC1 (mucin glycoprotein)
CC extracellular domain (ECD) antagonist for the manufacture of a medicant
CC to inhibit the proliferation of MUC-1 expressing cancer cells. MUC1 ECD
CC antagonists (optionally combined with a pharmaceutical carrier) can be
CC administered to inhibit proliferation of MUC1-expressing cancer cells,
CC useful to treat cancers e.g. skin cancer, prostate cancer and leukemia,
CC especially in humans. The method may also be combined with administration
CC of a chemotherapeutic agent (e.g. an alkylating agent, topisomerase etc)
CC or radiation to treat cancer, especially to reduce tumour growth. The
CC polypeptides are also useful in screening to identify MUC1 ECD
CC antagonists. The present sequence represents a human neuregulin 2 isoform
CC 5, a fragment of which can bind to MUC1/ECD

XX 

SQ Sequence 426 AA; 

Query Match 95.6%; Score 1505; DB 5; Length 426; 
Best Local Similarity 98.6%; Pred. No. 3.7e-100; 

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db        333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 381



RESULT 6
AAW27537
ID AAW27537 standard; protein; 330 AA.
XX
AC AAW27537;
XX
DT 18-DEC-1997 (first entry)
XX
DE Rat cerebellum derived growth factor 2.
XX
KW Rat; cerebellum derived growth factor; CDGF2; screening; binding;
KW modulation; erbB type receptor; identification; indication; risk;
KW proliferation; differentiation; induction; neuron; hyperplasia;
KW stem cell culture; intracerebral graft; alleviation; repair;
KW behavioural defect; nervous system; central; peripheral; nerve;
KW prothesis; damage; entubulation; cell survival; treatment; injury;
KW trauma; ischaemia; ischemia; stroke; infection; disorder; inflammation;
KW neurodegeneration; disease; Parkinson's; Huntingdon's;
KW amylotrophic lateral sclerosis; sensory; retina;
KW spinocerebellar degeneration; multiple sclerosis; neoplasia;
KW amalignant glioma; medulloblastoma; neuroectodermal tumour.
XX
OS Rattus rattus.
XX
FH Key Location/Qualifiers
FT Peptide 1..23
FT /label= sig_peptide
FT Peptide 24..330
FT /label= mat_peptide
FT Region 55
FT /note= "potential N-glycosylation site"
FT Domain 158..228
FT /label= immmunoglobulin_like_domain
FT Region 186
FT /note= "potential N-glycosylation site"
FT Domain 252..297
FT /label= epidermal_growth_factor_like_domain
FT Region 253
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"
FT Region 254
FT /note= "potential N-glycosylation site"
FT Region 261
FT /note= "characteristic cysteine of epidermal growth
FT factor like domain"


FT Region 267 

FT /note= "characteristic cysteine of epidermal growth 

FT factor like domain" 

FT Region 278 

FT /note= "characteristic cysteine of epidermal growth 

FT factor like domain" 

FT Region 280 

FT /note= "characteristic cysteine of epidermal growth 

FT factor like domain" 

FT Region 289 

FT /note= "characteristic cysteine of epidermal growth 

FT factor like domain" 

XX 

PN WO9709425-A1. 
XX 

PD 13-MAR-1997. 
XX 

PF 09-SEP-1996; 96WO-US014484.
XX
PR 08-SEP-1995; 95US-00525864.
XX 

PA (HARD ) HARVARD COLLEGE. 

PA (STRD ) UNIV LELAND S STANFORD. 

XX 

PI Chang H; 
XX 

DR WPI; 1997-192900/17. 

DR N-PSDB; AAT87923. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the treatment 

PT of neuronal injury and proliferative disorders. 

XX 

PS Claim 1; Page 70-71; 94pp; English. 
XX 

CC The present sequence is rat cerebellum derived growth factor 2 (CDGF2),

CC which can be used to screen for modulators of CDGF binding to erbB type 

CC receptors. Identification of a modification or mutation in a CDGF gene, 

CC or aberrant expression of a CDGF gene or levels of soluble CDGF may be 

CC used to indicate the risk of unwanted cell proliferation or 

CC differentiation. CDGF may be used to induce neuronal differentiation in 

CC stem cell culture, and maintain the integrity of a terminally 

CC differentiated neuronal cell culture, e.g. useful for intracerebral 

CC grafting to alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially where 

CC a crushed or severed axon is entubulated by a prosthetic. CDGF may also 

CC be used to enhance neuronal cell survival in the central or peripheral 

CC nervous system, to treat neurological conditions associated with nervous 

CC system injury, e.g. traumatic, chemical or vasal injury and deficits such 

CC as ischaemia resulting from stroke, infectious/inflammatory and tumour 

CC induced injury, chronic neurodegenerative disease including Parkinson's 

CC and Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the central 

CC nervous system, e.g. amalignant gliomas, medulloblastomas and 

CC neuroectodermal tumours 



XX 

SQ Sequence 330 AA; 



Query Match 93.9%; Score 1478; DB 2; Length 330;
Best Local Similarity 96.2%; Pred. No. 2.4e-98;
Matches 278; Conservative 5; Mismatches 6; Indels 0; Gaps 0;

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db        121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289



RESULT 7 
AAW63700 

ID AAW63700 standard; protein; 860 AA.
XX
AC AAW63700;
XX
DT 29-SEP-1998 (first entry)
XX
DE Receptor type tyrosine kinase ErbB ligand.
XX
KW Receptor type tyrosine kinase ErbB; ligand; diagnostic agent;
KW nervous disease; cancer.
XX
OS Rattus sp.
XX
PN JP10179166-A.
XX
PD 07-JUL-1998.
XX
PF 25-DEC-1996; 96JP-00356998.
XX
PR 25-DEC-1996; 96JP-00356998.
XX
PA (HIGA/) HIGASHIYAMA S.
XX
DR WPI; 1998-430952/37.
DR N-PSDB; AAV43674.
XX
PT Gene coding the ligand of the tyrosine kinase ErbB receptor - useful for
PT diagnosing and treating nervous diseases and cancer.
XX 

PS Claim 1; Page 9-13; 17pp; Japanese. 
XX 

CC This represents the ligand of receptor type tyrosine kinase ErbB. A 

CC prokaryotic or eukaryotic host cell transformed by a recombinant vector 

CC containing the encoding DNA can be used for the recombinant production of 

CC the protein. The invention provides a method for inhibiting the formation 

CC of the ligand of receptor type tyrosine kinase ErbB in an animal using an 

CC antibody recognizing the protein. The ligand of the tyrosine kinase ErbB 

CC receptor and associated materials can be used for treating or diagnosing 

CC nervous diseases and cancers 
XX 

SQ Sequence 860 AA;

Query Match 93.4%; Score 1470; DB 2; Length 860;
Best Local Similarity 95.8%; Pred. No. 2.8e-97;
Matches 277; Conservative 5; Mismatches 7; Indels 0; Gaps 0;

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60
              |||||||| |||||||||||||||||||||||||||||||||||| |||||||||||||
Db        109 MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGSSSNSTREP 168

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db        169 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db        229 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db        289 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db        349 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 397
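The percentages printed with each hit can be checked against the raw counts. The sketch below assumes, without confirmation from the GenCore documentation, that "Query Match" is the hit score divided by the perfect score (1574 for this query) and that "Best Local Similarity" is the number of identical positions divided by the aligned length (matches + conservative + mismatches + indels). The function names are hypothetical. Under those assumptions the figures for RESULT 7 and RESULT 8 are reproduced.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's self-alignment score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int = 0) -> float:
    """Identical positions as a percentage of the aligned length (assumed definition)."""
    aligned = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned, 1)


# RESULT 7: Score 1470, Matches 277, Conservative 5, Mismatches 7
print(query_match_pct(1470, 1574), best_local_similarity_pct(277, 5, 7))
# RESULT 8: Score 776, Matches 144, Conservative 1, Mismatches 3
print(query_match_pct(776, 1574), best_local_similarity_pct(144, 1, 3))
```

How indels enter the denominator is a guess; every alignment in this listing has zero indels, so the choice cannot be checked here.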



RESULT 8
ABG71639

ID ABG71639 standard; protein; 469 AA.
XX
AC ABG71639;
XX
DT 14-JAN-2003 (first entry)
XX
DE Human second splice variant of Don-1.
XX
KW Human; Don-1; epidermal growth factor; EGF; neuregulin;
KW glycoprotein ligand; cell proliferation; cell proliferative disorder;
KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation;
KW cell survival; epithelial cell; wound healing; tumour formation; brain;
KW vulnerary; cytostatic.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Misc-difference 14
FT /note= "Encoded by AA"
XX
PN US2002127594-A1.
XX
PD 12-SEP-2002.
XX
PF 12-MAR-2002; 2002US-00096241.
XX
PR 22-JUN-2000; 2000US-00599789.
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR N-PSDB; ABS56036. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 25; Fig 4; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents human second 

CC splice variant of Don-1 

XX 

SQ Sequence 469 AA; 



Query Match 49.3%; Score 776; DB 6; Length 469; 

Best Local Similarity 97.3%; Pred. No. 1.2e-47; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |: | ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 9
AAW48383

ID AAW48383 standard; protein; 647 AA.
XX
AC AAW48383;
XX
DT 17-AUG-1998 (first entry)
XX
DE Homo sapiens don-1 polypeptide.
XX
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell;
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung;
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus;
KW wound healing; transmembrane.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Domain 54..108
FT /note= "Ig domain"
FT Domain 142..178
FT /note= "EGF domain"
FT Domain 203..225
FT /note= "transmembrane domain"
FT Domain 226..647
FT /note= "cytoplasmic domain"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US014585.
XX
PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Gearing DP, Busfield SJ;
XX
DR WPI; 1998-169084/15.
DR N-PSDB; AAV17816.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas
PT and adenocarcinoma(s), and for wound healing.
XX
PS Claim 25; Fig 7; 121pp; English.
XX
CC The sequence is that encoded by a human don-1 gene splice variant. Don-1
CC polypeptides stimulate proliferation of epithelial cells and thus are
CC implicated in melanomas and adenocarcinomas in which epithelial cells
CC proliferate out of control. Compounds that interfere with don-1 mediated
CC cell proliferation can be used in the treatment of tumours such as
CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast,
CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus.
CC Alternatively, don-1 polypeptides can be used to stimulate epithelial
CC cell proliferation, e.g. for wound healing
XX 

SQ Sequence 647 AA; 

Query Match 49.3%; Score 776; DB 2; Length 647; 

Best Local Similarity 97.3%; Pred. No. 1.8e-47; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |: | ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 10 
ABG71644 

ID ABG71644 standard; protein; 647 AA.
XX 

AC ABG71644; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE Human third splice variant of Don-1.
XX

KW Human; Don-1; epidermal growth factor; EGF; neuregulin;

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; brain; 

KW vulnerary; cytostatic. 

XX 

OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Misc-difference 14
FT /note= "Encoded by AA"
FT Misc-difference 310
FT /note= "Encoded by AGC"
XX
PN US2002127594-A1.
XX
PD 12-SEP-2002.
XX
PF 12-MAR-2002; 2002US-00096241.
XX
PR 22-JUN-2000; 2000US-00599789.
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR N-PSDB; ABS56045. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 25; Fig 7; 66pp; English.
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents human third 

CC splice variant of Don-1 

XX 

SQ Sequence 647 AA; 

Query Match 49.3%; Score 776; DB 6; Length 647; 

Best Local Similarity 97.3%; Pred. No. 1.8e-47; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |: | ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 11 
AAW48382 

ID AAW48382 standard; protein; 469 AA. 
XX 



AC AAW48382; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 polypeptide. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 
KW wound healing; transmembrane. 
XX 

OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Domain 54..108
FT /note= "Ig domain"
FT Domain 142..178
FT /note= "EGF domain"
FT Domain 203..225
FT /note= "transmembrane domain"
FT Domain 226..469
FT /note= "cytoplasmic domain"

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US014585.
XX
PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 1998-169084/15. 
DR N-PSDB; AAV17815. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas
PT and adenocarcinoma(s), and for wound healing.

XX 

PS Claim 25; Fig 4; 121pp; English. 
XX 

CC The sequence is that encoded by a human don-1 gene splice variant. Don-1 
CC polypeptides stimulate proliferation of epithelial cells and thus are 
CC implicated in melanomas and adenocarcinomas in which epithelial cells 
CC proliferate out of control. Compounds that interfere with don-1 mediated 
CC cell proliferation can be used in the treatment of tumours such as 
CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 
CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 
CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 
CC cell proliferation, e.g. for wound healing 
XX 
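Each database entry in this listing uses two-letter line tags (ID, AC, DT, DE, KW, OS, PN, PD, PF, PR, PA, PI, DR, PT, PS, CC, SQ) with `XX` as a separator line. A minimal sketch of grouping such lines by tag is given below. The function name `parse_record` and the returned dictionary layout are illustrative assumptions, not part of any GenCore or Geneseq tool, and the sketch expects each tag and its value to sit on the same line.

```python
def parse_record(lines):
    """Group 'TAG value' lines by their two-letter tag, skipping blank and XX lines."""
    fields = {}
    for raw in lines:
        line = raw.strip()
        if not line or line == "XX":
            continue  # blank line or field separator
        tag, _, value = line.partition(" ")
        if len(tag) != 2 or not tag.isupper():
            continue  # not a tagged line (e.g. an alignment row)
        fields.setdefault(tag, []).append(value.strip())
    return fields


record = parse_record([
    "ID AAW48382 standard; protein; 469 AA.",
    "XX",
    "AC AAW48382;",
    "XX",
    "PR 19-AUG-1996; 96US-00699591.",
    "PR 19-NOV-1996; 96US-00753007.",
])
print(record["AC"])       # accession line(s)
print(len(record["PR"]))  # number of priority lines
```

Repeated tags such as KW, PR, DR and CC are kept as lists so that multi-line fields preserve their order.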

SQ Sequence 469 AA;

Query Match 48.9%; Score 770; DB 2; Length 469;
Best Local Similarity 96.6%; Pred. No. 3.4e-47;
Matches 143; Conservative 1; Mismatches 4; Indels 0; Gaps 0;

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |:   ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFAQRC 178



RESULT 12 
AAW48381 

ID AAW48381 standard; protein; 407 AA. 
XX 

AC AAW48381;
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 polypeptide. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 
KW wound healing; transmembrane.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Domain 16..70
FT /note= "Ig domain"
FT Domain 104..140
FT /note= "EGF domain"
FT Region 157..164
FT /note= "juxtamembrane region"
FT Domain 173..195
FT /note= "transmembrane domain"
FT Domain 196..407
FT /note= "cytoplasmic domain"

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US014585.
XX
PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Gearing DP, Busfield SJ; 



XX 

DR WPI; 1998-169084/15. 

DR N-PSDB; AAV17814. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas
PT and adenocarcinoma(s), and for wound healing.

XX 

PS Claim 25; Fig 3; 121pp; English. 
XX 

CC The sequence is that encoded by a human don-1 gene splice variant. Don-1 

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 
XX 

SQ Sequence 407 AA; 

Query Match 46.8%; Score 736; DB 2; Length 407; 

Best Local Similarity 97.1%; Pred. No. 8e-45;
Matches 136; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 13
ABG71638

ID ABG71638 standard; protein; 407 AA.
XX
AC ABG71638;
XX
DT 14-JAN-2003 (first entry)
XX
DE Human membrane-bound splice variant of Don-1.
XX
KW Human; Don-1; epidermal growth factor; EGF; neuregulin;
KW glycoprotein ligand; cell proliferation; cell proliferative disorder;
KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation;
KW cell survival; epithelial cell; wound healing; tumour formation; brain;
KW vulnerary; cytostatic.
XX
OS Homo sapiens.
XX
PN US2002127594-A1.
XX
PD 12-SEP-2002.
XX
PF 12-MAR-2002; 2002US-00096241.
XX
PR 22-JUN-2000; 2000US-00599789.
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ;
XX 

DR WPI; 2003-039584/03. 

DR N-PSDB; ABS56035. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 25; Fig 3; 66pp; English.
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents human membrane- 

CC bound splice variant of Don-1 

XX 

SQ Sequence 407 AA;

Query Match 46.8%; Score 736; DB 6; Length 407;
Best Local Similarity 97.1%; Pred. No. 8e-45;
Matches 136; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 14 
AAW48380 

ID AAW48380 standard; protein; 181 AA.
XX 

AC AAW48380; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Mus musculus don-1 polypeptide. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 

KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 

KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 

KW wound healing; transmembrane. 
XX 

OS Mus musculus. 
XX 

FH Key Location/Qualifiers 

FT Domain 104..140

FT /note= "EGF domain" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US014585.
XX
PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Gearing DP, Busfield SJ;
XX 

DR WPI; 1998-169084/15. 

DR N-PSDB; AAV17813. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas
PT and adenocarcinoma(s), and for wound healing.

XX 

PS Claim 25; Fig 2; 121pp; English. 
XX 

CC The sequence is that encoded by a murine don-1 gene splice variant. Don-1 

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 
XX 

SQ Sequence 181 AA; 



Query Match 45.5%; Score 716; DB 2; Length 181; 

Best Local Similarity 94.3%; Pred. No. 8.5e-44; 



Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 



Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140
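The line printed between each Qy and Db row marks identical residues. A rough sketch of deriving the identity marks from two equal-length rows follows; `identity_line` is a hypothetical helper. It marks only exact identities with `|`; the `:` marks for conservative substitutions depend on the scoring table (BLOSUM62 in this run) and are not reproduced here.

```python
def identity_line(qy: str, db: str) -> str:
    """Return '|' where the two aligned rows carry the same residue, ' ' elsewhere."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    return "".join("|" if a == b else " " for a, b in zip(qy, db))


qy = "IEGINQLSCKCPVGYTGDRC"   # query residues 270-289
db = "IEGINQLSCKCPNGFFGQRC"   # database residues 121-140
marks = identity_line(qy, db)
print(qy)
print(marks)
print(db)
print(marks.count("|"), "identical positions out of", len(qy))
```

Summing the identical positions over all rows of an alignment should give the "Matches" figure reported for that hit, assuming no gaps.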



RESULT 15
ABG71637

ID ABG71637 standard; protein; 181 AA.
XX
AC ABG71637;
XX
DT 14-JAN-2003 (first entry)
XX
DE Murine secreted splice variant of Don-1.
XX
KW [illegible]
XX
OS Mus sp.
XX
PN US2002127594-A1.
XX
PD 12-SEP-2002.
XX
PF 12-MAR-2002; 2002US-00096241.
XX
PR 22-JUN-2000; 2000US-00599789.
XX
PA (GEAR/) GEARING D P.
PA (BUSF/) BUSFIELD S J.
XX
PI Gearing DP, Busfield SJ;
XX
DR WPI; 2003-039584/03.
DR N-PSDB; ABS56034.
XX
PT Novel Don-1 polypeptide useful for stimulating proliferation of cells,
PT for identifying proteins that interact with Don-1, and for regulating
PT tumor formation and progression in brain.
XX
PS Claim 25; Fig 2; 66pp; English.
XX
CC The present invention relates to the isolation of a novel gene called Don
CC -1, and alternate splice variants of Don-1, which are related to



CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents murine secreted 

CC splice variant of Don-1 

XX 

SQ Sequence 181 AA; 

Query Match 45.5%; Score 716; DB 6; Length 181; 

Best Local Similarity 94.3%; Pred. No. 8.5e-44; 

Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



Search completed: August 17, 2004, 14:10:49 
Job time : 48.4522 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: August 17, 2004, 14:09:05 ; Search time 14.7102 Seconds 

(without alignments) 
1045.842 Million cell updates/sec 



Title: US-09-864-675-4 

Perfect score: 1574 

Sequence: 1 MRRDPAPGFSMLLFGVSLAC KCPVGYTGDRCQQFAMVNFS 298 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 
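The throughput figure in the header appears to follow from the other header values: a Smith-Waterman search fills one dynamic-programming cell per (query residue, database residue) pair, so the cell count is the query length times the residues searched. The sketch below assumes that reading; the function name is hypothetical. With the query length of 298, the 51625971 residues searched and the 14.7102 second search time, it lands on the printed 1045.842 million cell updates/sec to within rounding.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second, in millions (assumed definition)."""
    return query_len * db_residues / seconds / 1e6


rate = million_cell_updates_per_sec(298, 51_625_971, 14.7102)
print(f"{rate:.3f} million cell updates/sec")
```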



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : Issued_Patents_AA:*

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*
2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*
3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*
4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*
5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*
6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 

Result         % Query
   No.  Score    Match  Length  DB  ID                      Description

     1  [illegible]                                         Sequence [illegible]
     2  [illegible]                                         Sequence [illegible]
     3  [illegible]                                         Sequence [illegible]
     4  [illegible]  49.[illegible]         US-09-398-[illegible]   Sequence [illegible], Appli
     5  [illegible]                     3   US-08-753-007A-32       Sequence 32, Appl
     6    776    49.[illegible]             US-09-398-496-32        Sequence 32, Appl
     7    736     46.8  [illegible]  3  US-08-753-007A-6        Sequence 6, Appli
     8    736     46.8     407   3  US-09-398-496-6         Sequence 6, Appli
     9    716     45.5  [illegible]  3  US-08-753-007A-4        Sequence 4, Appli
    10    716     45.5     181   3  US-09-398-496-4         Sequence 4, Appli
    11    716     45.5     605   3  US-08-753-007A-2        Sequence 2, Appli
    12    716     45.5  [illegible]     US-09-[illegible]       Sequence 2, Appli
    13    545     34.6     411  [illegible]  US-08-470-[illegible]   Sequence [illegible], App
    14    545     34.6     422   4  US-08-[illegible]       Sequence [illegible], App
    15    545     34.6     456   4  US-08-[illegible]       Sequence [illegible], App
    16    545     34.6     601  [illegible]  US-08-[illegible]       Sequence [illegible], App
    17    545     34.6     601   4  US-08-[illegible]       Sequence [illegible], App
    18    545     34.6     610  [illegible]  US-08-[illegible]       Sequence [illegible], App
    19    545     34.6     610   4  US-08-[illegible]       Sequence [illegible], App
    20    545     34.6     635   4  US-08-[illegible]       Sequence [illegible], App
    21    545     34.6     644   4  US-08-467-[illegible]   Sequence [illegible], App
    22    545     34.6     818   3  US-08-470-[illegible]   Sequence [illegible], App
    23    545     34.6     818   4  US-08-467-602-321       Sequence 321, App
    24    545     34.6     827  [illegible]  US-08-[illegible]       Sequence [illegible], App
    25    545     34.6     827   4  US-08-467-[illegible]   Sequence [illegible], App
    26    545     34.6     852   4  US-08-[illegible]       Sequence 363, App
    27    545     34.6     861   4  US-08-467-[illegible]   Sequence [illegible], App
    28    545     34.6     865   3  US-08-470-[illegible]   Sequence [illegible], App
    29    545     34.6     865   4  US-08-[illegible]       Sequence [illegible], App
    30    545     34.6     874   3  US-08-470-335-238       Sequence 238, App
    31    545     34.6     874   4  US-08-467-602-334       Sequence 334, App
    32    545     34.6     899   4  US-08-467-602-364       Sequence 364, App
    33    545     34.6     908   4  US-08-467-602-376       Sequence 376, App
    34    544     34.6     422   1  US-08-036-[illegible]   Sequence 170, App
    35    544     34.6     422   1  US-08-469-569-170       Sequence 170, App
    36    544     34.6     422   1  US-08-428-926-3         Sequence 3, Appli
    37    544     34.6     422   1  US-08-249-[illegible]   Sequence 170, App
    38    544     34.6     422   1  US-08-428-927-3         Sequence 3, Appli
    39    544     34.6     422   1  US-08-428-298-3         Sequence 3, Appli
    40    544     34.6     422   1  US-08-339-[illegible]   Sequence 3, Appli
    41    544     34.6     422   1  US-08-469-526A-170      Sequence 170, App
    42    544     34.6     422   2  US-08-734-591A-170      Sequence 170, App
    43    544     34.6     422   2  US-08-469-660-170       Sequence 170, App
    44    544     34.6     422   3  US-08-341-018-72        Sequence 72, Appl
    45    544     34.6     422   3  US-08-470-335-170       Sequence 170, App


ALIGNMENTS



RESULT 1 

US-08-525-864A-2 

; Sequence 2, Application US/08525864A 
; Patent No. 5912326 

GENERAL INFORMATION: 
; APPLICANT: Chang, Han

; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses 
; TITLE OF INVENTION: Related thereto 
; NUMBER OF SEQUENCES: 18 

; CORRESPONDENCE ADDRESS: 

ADDRESSEE: LAHIVE & COCKFIELD 

STREET: 28 State Street 
; CITY: Boston 

; STATE: Massachusetts 

COUNTRY: USA 

ZIP: 02109 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Floppy disk 



; COMPUTER: IBM PC compatible 

; OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: AscII (text) 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/525,864A

FILING DATE: 8-SEP-1995 

CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 

NAME: Kara, Catherine J. 

REGISTRATION NUMBER: 41,106 

REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214 

; INFORMATION FOR SEQ ID NO: 2: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 754 amino acids 

; TYPE: amino acid 

; TOPOLOGY: linear 

; MOLECULE TYPE: protein 
US-08-525-864A-2 



Query Match 98.3%; Score 1547; DB 2; Length 754;

Best Local Similarity 97.7%; Pred. No. 4.7e-132; 

Matches 291; Conservative 4; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db        121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298



RESULT 2 

US-08-525-864A-4 

; Sequence 4, Application US/08525864A 
; Patent No. 5912326 
; GENERAL INFORMATION: 

APPLICANT: Chang, Han 

TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses 
TITLE OF INVENTION: Related thereto 
; NUMBER OF SEQUENCES: 18 

CORRESPONDENCE ADDRESS: 



ADDRESSEE: LAHIVE & COCKFIELD 
STREET: 28 State Street 
CITY: Boston 
STATE: Massachusetts 
COUNTRY: USA 
ZIP: 02109 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: AscII (text) 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/525,864A
FILING DATE: 8-SEP-1995 
CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 
NAME: Kara, Catherine J. 
REGISTRATION NUMBER: 41,106 
REFERENCE/DOCKET NUMBER: HUI-017 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (617)227-7400
TELEFAX: (617)742-4214
INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 330 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-525-864A-4 

Query Match 93.9%; Score 1478; DB 2; Length 330;

Best Local Similarity 96.2%; Pred. No. 2.8e-126; 

Matches 278; Conservative 5; Mismatches 6; Indels 0; Gaps 0; 

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db        121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289



RESULT 3 

US-08-753-007A-8 



; Sequence 8, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P. 

; APPLICANT: Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR 

; NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
; CITY: Boston 

STATE: MA

COUNTRY: US 
; ZIP: 02110-2804 

; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0

; CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/08/753,007A

FILING DATE: 19-NOV-1996 

CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 
; TELEFAX: 617-542-8906 

; TELEX: 

; INFORMATION FOR SEQ ID NO: 8: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 469 amino acids 

; TYPE: amino acid 

STRANDEDNESS: not relevant 
; TOPOLOGY: linear 

; MOLECULE TYPE: protein 
; FRAGMENT TYPE: internal 

US-08-753-007A-8 



Query Match 49.3%; Score 776; DB 3; Length 469; 

Best Local Similarity 97.3%; Pred. No. 2.7e-62; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |: | ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 4 
US-09-398-496-8 

Sequence 8, Application US/09398496 
Patent No. 6133423 
GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
APPLICANT: Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: 

INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 469 amino acids
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
US-09-398-496-8 



Query Match 49.3%; Score 776; DB 3; Length 469; 

Best Local Similarity 97.3%; Pred. No. 2.7e-62; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 



Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |: | ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 5 

US-08-753-007A-32 

; Sequence 32, Application US/08753007A 
; Patent No. 6074841 
; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
TITLE OF INVENTION: AND USES THEREFOR 
; NUMBER OF SEQUENCES: 33 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

; CITY: Boston 

; STATE: MA 

COUNTRY: US 

ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A

FILING DATE: 19-NOV-1996 
; CLASSIFICATION: 536 

; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 
; TELEX: 

; INFORMATION FOR SEQ ID NO: 32: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 647 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 
; TOPOLOGY: linear 



; MOLECULE TYPE: protein 
; FRAGMENT TYPE: internal 
US-08-753-007A-32 



Query Match 49.3%; Score 776; DB 3; Length 647;

Best Local Similarity 97.3%; Pred. No. 4.2e-62; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |: | ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 6 

US-09-398-496-32 

; Sequence 32, Application US/09398496 
; Patent No. 6133423 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

; CITY: Boston 

STATE: MA 
COUNTRY: US 
; ZIP: 02110-2804 

COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 

; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0

CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/398,496

; FILING DATE: 

; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
; FILING DATE: 19-NOV-1996 

APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 

ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001 

TELECOMMUNICATION INFORMATION: 



TELEPHONE: 617-542-5070
TELEFAX: 617-542-8906
TELEX: 

INFORMATION FOR SEQ ID NO: 32: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 647 amino acids 
TYPE: amino acid 
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
US-09-398-496-32 



Query Match 49.3%; Score 776; DB 3; Length 647;

Best Local Similarity 97.3%; Pred. No. 4.2e-62;

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
              |||||||||||||||||||| |: | ||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 7 

US-08-753-007A-6 

; Sequence 6, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P. 

; APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

CITY: Boston 
; STATE: MA 

COUNTRY: US 
ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 

; SOFTWARE: FastSEQ Version 2.0 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A
FILING DATE: 19-NOV-1996 
CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 



APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070
TELEFAX: 617-542-8906
TELEX: 

INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 407 amino acids
TYPE: amino acid 
STRANDEDNESS: not relevant
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
US-08-753-007A-6 

Query Match 46.8%; Score 736; DB 3; Length 407;

Best Local Similarity 97.1%; Pred. No. 9.7e-59; 

Matches 136; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 8 
US-09-398-496-6 

; Sequence 6, Application US/09398496 
; Patent No. 6133423 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 
; APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
CITY: Boston 
; STATE: MA 

COUNTRY: US 
ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 



OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: 

INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 407 amino acids
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
US-09-398-496-6 

Query Match 46.8%; Score 736; DB 3; Length 407; 

Best Local Similarity 97.1%; Pred. No. 9.7e-59;

Matches 136; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 9 

US-08-753-007A-4 

; Sequence 4, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J. 
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 



ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 

CITY: Boston 

STATE: MA
; COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 

; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A

FILING DATE: 19-NOV-1996 

CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/699,591 

; FILING DATE: 19-AUG-1996 

ATTORNEY/AGENT INFORMATION: 

NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 

TELEX:

; INFORMATION FOR SEQ ID NO: 4: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 181 amino acids 

; TYPE: amino acid 

; STRANDEDNESS: not relevant 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
; FRAGMENT TYPE: internal 
US-08-753-007A-4 

Query Match 45.5%; Score 716; DB 3; Length 181; 

Best Local Similarity 94.3%; Pred. No. 2.1e-57; 

Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 



Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 10 
US-09-398-496-4 

; Sequence 4, Application US/09398496 
; Patent No. 6133423 



GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C.
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906
TELEX: 

INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 181 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
US-09-398-496-4 

Query Match 45.5%; Score 716; DB 3; Length 181; 

Best Local Similarity 94.3%; Pred. No. 2.1e-57; 

Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 11 
US-08-753-007A-2 

; Sequence 2, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

CITY: Boston 

STATE: MA 

COUNTRY: US 
; ZIP: 02110-2804 

COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/08/753,007A

FILING DATE: 19-NOV-1996 
; CLASSIFICATION: 536 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 

ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906
; TELEX: 

; INFORMATION FOR SEQ ID NO: 2: 

; SEQUENCE CHARACTERISTICS:

; LENGTH: 605 amino acids 

; TYPE: amino acid 

; STRANDEDNESS: not relevant

; TOPOLOGY: linear 

; MOLECULE TYPE: protein 

; FRAGMENT TYPE: internal 

US-08-753-007A-2 



Query Match 45.5%; Score 716; DB 3; Length 605;

Best Local Similarity 94.3%; Pred. No. 1.1e-56;

Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0;

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 12 
US-09-398-496-2 

; Sequence 2, Application US/09398496 

; Patent No. 6133423 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P. 

; APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
; STREET: 225 Franklin Street 

CITY: Boston 
; STATE: MA 

; COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0 

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496

FILING DATE: 

CLASSIFICATION: 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 

FILING DATE: 19-NOV-1996 
; APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 617-542-5070 

; TELEFAX: 617-542-8906 

TELEX:

; INFORMATION FOR SEQ ID NO: 2: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 605 amino acids 

; TYPE: amino acid 

; STRANDEDNESS: not relevant 

; TOPOLOGY: linear 



; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-09-398-496-2



Query Match 45.5%; Score 716; DB 3; Length 605; 

Best Local Similarity 94.3%; Pred. No. 1.1e-56;

Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPVGYTGDRC 289
              |||||||||||| |: | ||
Db        121 IEGINQLSCKCPNGFFGQRC 140



RESULT 13 
US-08-470-339-189 

; Sequence 189, Application US/08470339C 

; Patent No. 6232286 

; GENERAL INFORMATION: 

; APPLICANT: GOODEARL, ANDREW 

; APPLICANT: STROOBANT, PAUL

; APPLICANT: MINGHETTI, LUISA 

; APPLICANT: WATERFIELD, MICHAEL 

; APPLICANT: MARCHIONNI, MARK 

; APPLICANT: CHEN, MARIO S. 

; APPLICANT: HILES, IAN 

; TITLE OF INVENTION: GLIAL MITOGENIC FACTORS, THEIR 
; TITLE OF INVENTION: PREPARATION AND USE 
; FILE REFERENCE: 04585/002008 

; CURRENT APPLICATION NUMBER: US/08/470,339C

; CURRENT FILING DATE: 1995-06-06 

; EARLIER APPLICATION NUMBER: 08/036,555 

; EARLIER FILING DATE: 1993-03-24 

; EARLIER APPLICATION NUMBER: 07/940,389 

; EARLIER FILING DATE: 1992-09-03 

; EARLIER APPLICATION NUMBER: 07/907,138 

; EARLIER FILING DATE: 1992-06-30 

; EARLIER APPLICATION NUMBER: 07/863,703 

; EARLIER FILING DATE: 1992-04-03 

; EARLIER APPLICATION NUMBER: 91 07566.3 GB 

; EARLIER FILING DATE: 1999-04-10 

; NUMBER OF SEQ ID NOS: 226

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 189 
; LENGTH: 411 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-08-470-339-189 



Query Match 34.6%; Score 545; DB 3; Length 411;



Best Local Similarity 35.6%; Pred. No. 2.2e-41; 

Matches 127; Conservative 62; Mismatches 90; Indels 78; Gaps 13; 



Qy 15 GVSLACYS--PSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS--NSTRE 59

11:111 I I : I I I : I : I M : I I I I I I ' I i : I I 

Db 58 GASV-CYSSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116 

Qy 60 PPASGRVA LVKVLDKWPLRSGGLQ 83 

111:11 Mill :::|ll: 

Db 117 PPAAGPRALGPPAEEPLIAANGTVPSWPTAPVPSAGEPGEEAPYLAAKVHQVWAVKAGGLK 176

Qy 84 REQVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128 

: : : : : I I I j : : I I I I I : I I : I I : : I I I : I I : I 

Db 177 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRN 235

Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188

I II I I :: I I II I : I I : I I I I II I : I I : = : : | I I : I I I I I 

Db 236 LKKEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRK 295 

Qy 189 RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245 

: : I : I : I : I I : I I : I : I I I : I : : I I I : : : : I = I 

Db 296 NKPQNIKIQKKPGK— SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTS 353 

Qy 246 WSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297

: I I III I :: I I I M I : : : : : I I I I I : I I I I I I : I : I 
Db 354 TTGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 410 



RESULT 14 
US-08-467-602-324 

; Sequence 324, Application US/08467602C 

; Patent No. 6444642 

; GENERAL INFORMATION: 

; APPLICANT: Sklar, Robert 

; APPLICANT: Marchionni, Mark 

; APPLICANT: Gwynne, David I. 

; TITLE OF INVENTION: METHODS FOR TREATING MUSCLE DISEASES AND 
; TITLE OF INVENTION: DISORDERS 
; FILE REFERENCE: 04585/028003 

; CURRENT APPLICATION NUMBER: US/08/467,602C

; CURRENT FILING DATE: 1995-06-06 

; EARLIER APPLICATION NUMBER: 08/209,204

; EARLIER FILING DATE: 1994-03-08 

; EARLIER APPLICATION NUMBER: 08/059,022 

; EARLIER FILING DATE: 1993-05-06 

; NUMBER OF SEQ ID NOS: 420

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 324 
; LENGTH: 422 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-08-467-602-324 

Query Match 34.6%; Score 545; DB 4; Length 422; 

Best Local Similarity 35.6%; Pred. No. 2.3e-41; 

Matches 127; Conservative 62; Mismatches 90; Indels 78; Gaps 13; 



Qy 15 GVSLACYS--PSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS--NSTRE 59

I I : I I I M : I I I : I : I 1 I : I M I 11= I I ' II

Db 58 GASV-CYSSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116

Qy 60 PPASGRVA LVKVLDKWPLRSGGLQ 83

|||:| I MM I :::MI:

Db 117 PPAAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLK 176

Qy 84 REQVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128

M M : M M M M : I M M M : I M I

Db 177 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRN 235

Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188

I 1 I M : M I M h M M M I I I MM:: : : M M I M M

Db 236 LKKEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRK 295

Qy 189 --RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245

: : I : M M I M M : M M M M : M M : : : I M

Db 296 [illegible]

Qy 246 WSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297

:| ::| I I I 1 I : :: :: I MM : I : I :l

Db 354 [illegible]



RESULT 15 
US-08-467-602-366 

; Sequence 366, Application US/08467602C 

; Patent No. 6444642 

; GENERAL INFORMATION: 

; APPLICANT: Sklar, Robert 

; APPLICANT: Marchionni, Mark 

; APPLICANT: Gwynne, David I. 

; TITLE OF INVENTION: METHODS FOR TREATING MUSCLE DISEASES AND 
; TITLE OF INVENTION: DISORDERS 
; FILE REFERENCE: 04585/028003 

; CURRENT APPLICATION NUMBER: US/08/467,602C

; CURRENT FILING DATE: 1995-06-06 

; EARLIER APPLICATION NUMBER: 08/209,204 

; EARLIER FILING DATE: 1994-03-08 

; EARLIER APPLICATION NUMBER: 08/059,022 

; EARLIER FILING DATE: 1993-05-06 

; NUMBER OF SEQ ID NOS: 420

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 366 
LENGTH: 456 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
; FEATURE: 

NAME/KEY: VARIANT 

LOCATION: (34) ... (34) 
; OTHER INFORMATION: Xaa is any amino acid 
US-08-467-602-366 

Query Match 34.6%; Score 545; DB 4; Length 456;

Best Local Similarity 35.6%; Pred. No. 2.5e-41; 

Matches 127; Conservative 62; Mismatches 90; Indels 78; Gaps 



Qy 



15 GVSLACYS — PSLKSVQDQAYKAPVWEGKV QGLV PAGGSSS— NSTRE 59 

11:111 II: III: I :| 11:1111 11= 11= H 

Db 92 GASV-CYSSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 150 



Qy 60 PPASGRVA LVKVLDKWPLRSGGLQ 83 

|||:| I MM I ::MII: 

Db 151 PPAAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLK 210 



Qy 84 REQVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128 

:: :::| || M : IIIMMI : I |: =1 MM Ml 

Db 211 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRN 269 

Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188 

I II I I :: I I II I : M M I II I I I M I :: :: II I M II II 

Db 270 LKKEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRK 329 

Qy 189 ---RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245 

::1:1: |: I |: M : MMMM : II M : : : I M 
Db 330 NKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTS 387 

Qy 246 WSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297 

:| I II 1 |::MMI M :: :: I MM MIMM : I M 
Db 388 TTGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 444 
Scoring table: BLOSUM62 
Minimum DB seq length: 0 
Listing first 45 summaries 
pir3 : * 
ALIGNMENTS 



RESULT 1 
JC5700 

ErbB kinase activator alpha, brain and thymus - human 
C;Species: Homo sapiens (man) 
C;Date: 25-Nov-1997 #sequence_revision 25-Nov-1997 #text_change 08-Sep-2002 
C;Accession: JC5700 
R;Higashiyama, S.; Horikawa, M.; Yamada, K.; Ichino, N.; Nakano, N.; Nakagawa, 
T.; Miyagawa, J.; Matsushita, N.; Nagatsu, T.; Taniguchi, N.; Ishiguro, H. 
J. Biochem. 122, 675-680, 1997 
A;Title: A novel brain-derived member of the epidermal growth factor family that 
interacts with ErbB3 and ErbB4. 
A;Reference number: JC5700; MUID:98006324; PMID:9348101 
A;Accession: JC5700 
A;Status: nucleic acid sequence not shown 
A;Molecule type: mRNA 
A;Residues: 1-850 <HIG> 
A;Cross-references: DDBJ:AB005060; NID:g2626738; PIDN:BAA23417.1; PID:g2626739 
A;Experimental source: SK-NSH cell 
C;Comment: This protein is a member of the epidermal growth factor family. It is 
functionally similar to neurogulin in terms of directly activating ErbB4, 
transactivating ErbB1, B2 and B3, and stimulating the differentiation of 
MDA-MB-453 cells. 
C;Superfamily: human ErbB kinase activator alpha, brain and thymus; EGF 
homology; immunoglobulin homology 
C;Keywords: glycoprotein 
F;258-311/Domain: Ig-like #status predicted <IGL> 
F;345-381/Domain: EGF homology <EGF> 
F;346-381/Domain: EGF-like #status predicted <EGF2> 
F;147,278,451/Binding site: carbohydrate (Asn) (covalent) #status predicted 

Query Match 95.6%; Score 1505; DB 2; Length 850; 
Best Local Similarity 98.6%; Pred. No. 1.5e-107; 
Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 152 

Qy   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212 

Qy  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272 

Qy  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332 

Qy  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289 
        ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db  333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 381 



RESULT 2 
JC5702 

ErbB kinase activator alpha2a, brain and thymus - rat 
C;Species: Rattus norvegicus (Norway rat) 
C;Date: 25-Nov-1997 #sequence_revision 25-Nov-1997 #text_change 08-Sep-2002 
C;Accession: JC5702; PC4417 
R;Higashiyama, S.; Horikawa, M.; Yamada, K.; Ichino, N.; Nakano, N.; Nakagawa, 
T.; Miyagawa, J.; Matsushita, N.; Nagatsu, T.; Taniguchi, N.; Ishiguro, H. 
J. Biochem. 122, 675-680, 1997 
A;Title: A novel brain-derived member of the epidermal growth factor family that 
interacts with ErbB3 and ErbB4. 
A;Reference number: JC5700; MUID:98006324; PMID:9348101 
A;Accession: JC5702 
A;Status: nucleic acid sequence not shown 
A;Molecule type: mRNA 
A;Residues: 1-860 <HIG> 
A;Cross-references: DDBJ:D89996; NID:g2605631; PIDN:BAA23345.1; PID:g2605632 
A;Experimental source: PC-12 cell 
A;Accession: PC4417 
A;Status: nucleic acid sequence not shown 
A;Molecule type: mRNA 
A;Residues: 'F',212-213,223-860 <HI2> 
A;Cross-references: DDBJ:AB001576; NID:g2605478; PIDN:BAA23348.1; PID:g2605479 
A;Experimental source: PC-12 cell 
C;Comment: This protein is a member of the epidermal growth factor family. It is 
functionally similar to neurogulin in terms of directly activating ErbB4, 
transactivating ErbB1, B2 and B3, and stimulating the differentiation of 
MDA-MB-453 cells. 
C;Superfamily: human ErbB kinase activator alpha, brain and thymus; EGF 
homology; immunoglobulin homology 
C;Keywords: glycoprotein 
F;274-327/Domain: Ig-like #status predicted <IGL> 
F;361-397/Domain: EGF homology <EGF> 
F;422-444/Domain: hydrophobic #status predicted <HYD> 
F;163,294,467/Binding site: carbohydrate (Asn) (covalent) #status predicted 

Query Match 93.4%; Score 1470; DB 2; Length 860; 
Best Local Similarity 95.8%; Pred. No. 7.5e-105; 
Matches 277; Conservative 5; Mismatches 7; Indels 0; Gaps 0; 
Qy 1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60 

I M I I I I I I I I I I I I I I M M I I I I I I I I I I I I I I I I I M I I I M I I I I I M I I I I I I 

Db 109 MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGSSSNSTREP 168 

Qy 61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120 

I I I I I I I I I M I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I M I I M M I I I I I 
Db 169 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228 

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180 

1:1 I I I I :! I I I I I I I I I I I I I I I I I I I I I I I I I : I M I I I I I I I I I I ! M I I I I I I I I 
Db 229 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288 

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I M I I I M : I I I I I 
Db 289 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348 

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289 

I I I I I M I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I : I II 
Db 349 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 397 



RESULT 3 
JC5701 

ErbB kinase activator alpha1, brain and thymus - rat 
C;Species: Rattus norvegicus (Norway rat) 
C;Date: 25-Nov-1997 #sequence_revision 25-Nov-1997 #text_change 08-Sep-2002 
C;Accession: JC5701; PC4411 
R;Higashiyama, S.; Horikawa, M.; Yamada, K.; Ichino, N.; Nakano, N.; Nakagawa, 
T.; Miyagawa, J.; Matsushita, N.; Nagatsu, T.; Taniguchi, N.; Ishiguro, H. 
J. Biochem. 122, 675-680, 1997 
A;Title: A novel brain-derived member of the epidermal growth factor family that 
interacts with ErbB3 and ErbB4. 
A;Reference number: JC5700; MUID:98006324; PMID:9348101 
A;Accession: JC5701 
A;Molecule type: mRNA 
A;Residues: 1-868 <HIG> 
A;Cross-references: DDBJ:D89995; NID:g2605629; PIDN:BAA23344.1; PID:g2605630 
A;Accession: PC4411 
A;Molecule type: protein 
A;Residues: 128-162 <HI2> 
A;Experimental source: PC-12 cell 
C;Comment: This protein is a member of the epidermal growth factor family. It is 
functionally similar to neurogulin in terms of directly activating ErbB4, 
transactivating ErbB1, B2 and B3, and stimulating the differentiation of 
MDA-MB-453 cells. 
C;Superfamily: human ErbB kinase activator alpha, brain and thymus; EGF 
homology; immunoglobulin homology 
F;361-397/Domain: EGF homology <EGF> 

Query Match 93.4%; Score 1470; DB 2; Length 868; 
Best Local Similarity 95.8%; Pred. No. 7.6e-105; 
Matches 277; Conservative 5; Mismatches 7; Indels 0; Gaps 0; 

Qy 1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60 

I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 109 MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGSSSNSTREP 168 

Qy 61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I I I I I I I I I I I I I I I M I I I I I I 
Db 169 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228 

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180 

I : I I I I I : I I I I I I I I I I I I M I I I I I I I I I I M : I I I I I I I I M M I I I I I I I I I I I I 
Db 229 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288 

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240 

I I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I M I : I I I I I 
Db 289 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348 

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I I I I : I M 
Db 349 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 397 



RESULT 4 
S32357 

glial growth factor - human 
C; Species: Homo sapiens (man) 

C;Date: 02-Dec-1993 #sequence_revision 10-Nov-1995 #text_change 08-Sep-2002 
C;Accession: S32357 
R;Marchionni, M.A.; Goodearl, A.D.J.; Chen, M.S.; Bermingham-McDonogh, O.; Kirk, 
C.; Hendricks, M.; Danehy, F.; Misumi, D.; Sudhalter, J.; Kobayashi, K.; 
Wroblewski, D.; Lynch, C.; Baldassare, M.; Hiles, I.; Davis, J.B.; Hsuan, J.J.; 
Totty, N.F.; Otsu, M.; McBurney, R.N.; Waterfield, M.D.; Stroobant, P.; Gwynne, 
D. 
Nature 362, 312-318, 1993 
A;Title: Glial growth factors are alternatively spliced erbB2 ligands expressed 
in the nervous system. 
A;Reference number: S32357; MUID:93205115; PMID:8096067 
A;Accession: S32357 
A;Status: preliminary 
A;Molecule type: mRNA 
A;Residues: 1-422 <MAR> 
A;Cross-references: GB:L12260; NID:g292047; PIDN:AAB59622.1; PID:g292048 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 
F;363-402/Domain: EGF homology <EGF> 

Query Match 34.6%; Score 544; DB 2; Length 422; 
Best Local Similarity 35.6%; Pred. No. 3.4e-34; 
Matches 127; Conservative 62; Mismatches 90; Indels 78; Gaps 13; 

Qy 15 GVSLACYS--PSLKSVQDQAYKAPVWEGKV QGLV PAGGSSS--NSTRE 59 

11:111 I I : I I I : I : I I I : I I I I I I : I I ' > I 

Db 58 GASV-CYSSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116 

Qy 60 PPASGRVA LVKVLDKWPLRSGGLQ 83 

111:11 I I I I I : : : I I I : 

Db 117 PPAAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLK 176 

Qy 84 REQVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128 

II I : : I I M I : M = I I : : I I I : I 1 = I 

Db 177 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRN 235 

Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188 

I I I I I :: I I It I : I I : I I I I II I : M : : | | | : | II | I 

Db 236 LKKEVSRVLCKRCALPPQLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRK 295 

Qy 189 RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245 

: : I : I : I : I I : I I : I : I I I : I : : M I : : : : I = I 

Db 296 NKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTS 353 

Qy 246 WSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297 

: I I I I I I : : I I I I I I : : : : : I I I I I : I I I I I I : I : I 
Db 354 TTGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 410 
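
The percentages in each result header appear to be derivable from the other fields. The sketch below is not part of the GenCore output; it assumes that "Query Match" is the hit score divided by the perfect score (1574 in this search) and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). Both assumptions agree with the figures printed for RESULT 4, but they are inferred from the numbers, not taken from GenCore documentation. The function names are hypothetical.

```python
import re

# "Perfect score" as printed in the search header for this query.
PERFECT_SCORE = 1574


def parse_stats(text):
    """Pull the numeric fields out of a GenCore-style result header."""
    fields = {}
    for key in ("Score", "Matches", "Conservative", "Mismatches", "Indels", "Gaps"):
        # Each field is printed as "<Name> <number>;"
        m = re.search(rf"\b{key}\s+([\d.]+)\s*;", text)
        if m:
            fields[key] = float(m.group(1))
    return fields


def derived_percentages(fields, perfect_score=PERFECT_SCORE):
    """Recompute (Query Match %, Best Local Similarity %) under the stated assumptions."""
    columns = (fields["Matches"] + fields["Conservative"]
               + fields["Mismatches"] + fields["Indels"])
    query_match = 100.0 * fields["Score"] / perfect_score
    similarity = 100.0 * fields["Matches"] / columns
    return round(query_match, 1), round(similarity, 1)


# Header of RESULT 4 (S32357), copied from the listing above.
header = """
Query Match 34.6%; Score 544; DB 2; Length 422;
Best Local Similarity 35.6%; Pred. No. 3.4e-34;
Matches 127; Conservative 62; Mismatches 90; Indels 78; Gaps 13;
"""

stats = parse_stats(header)
print(derived_percentages(stats))
```

A check of this kind can flag header values that were damaged in transcription, for example a decimal point read as a comma.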



RESULT 5 
D43273 

heregulin precursor, splice form beta-3 - human 

N;Alternate names: glial growth factor HRG-beta-3; neuregulin 
C;Species: Homo sapiens (man) 
C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002 
C;Accession: D43273; S32358 
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park, 
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.; 
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L. 
Science 256, 1205-1210, 1992 
A;Title: Identification of heregulin, a specific activator of p185(erbB2). 
A;Reference number: A43273; MUID:92271253; PMID:1350381 
A;Accession: D43273 
A;Status: preliminary; nucleic acid sequence not shown; not compared with 
conceptual translation 
A;Molecule type: mRNA 
A;Residues: 1-241 <HOL> 
R;Marchionni, M.A.; Goodearl, A.D.J.; Chen, M.S.; Bermingham-McDonogh, O.; Kirk, 
C.; Hendricks, M.; Danehy, F.; Misumi, D.; Sudhalter, J.; Kobayashi, K.; 
Wroblewski, D.; Lynch, C.; Baldassare, M.; Hiles, I.; Davis, J.B.; Hsuan, J.J.; 
Totty, N.F.; Otsu, M.; McBurney, R.N.; Waterfield, M.D.; Stroobant, P.; Gwynne, 
D. 
Nature 362, 312-318, 1993 
A;Title: Glial growth factors are alternatively spliced erbB2 ligands expressed 
in the nervous system. 
A;Reference number: S32357; MUID:93205115; PMID:8096067 
A;Accession: S32358 
A;Molecule type: mRNA 
A;Residues: 1-241 <MAR> 
A;Cross-references: GB:L12261; NID:g292049; PIDN:AAB59358.1; PID:g292050 
C;Genetics: 
A;Gene: GDB:HGL; GGF 
A;Cross-references: GDB:132656; OMIM:142445 
A;Map position: 8p22-8p11 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 
C;Keywords: alternative splicing 
F;182-221/Domain: EGF homology <EGF> 

Query Match 19.5%; Score 306.5; DB 2; Length 241; 
Best Local Similarity 33.0%; Pred. No. 2.9e-16; 
Matches 73; Conservative 36; Mismatches 61; Indels 51; Gaps 7; 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 

II I II I : I I : I I : I II I II I : I I : : : : I 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234 

M:| MM ::|:|: |: I |: II : MIMM: : II M : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 128 

Qy 235 YV NSVSTTLSSWSG--HARKCNETAKS 259 

II I : I : I : I M I M 1 I : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS CKCPVGYTGDRCQQFAMVNF 297 

: II II I I : : : : : I II II M II I II : I M 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 229 



RESULT 6 
C43273 

heregulin precursor, splice form beta-2 - human 
C; Species: Homo sapiens (man) 

C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002 
C;Accession: C43273; I38407 
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park, 
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.; 
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L. 
Science 256, 1205-1210, 1992 
A;Title: Identification of heregulin, a specific activator of p185(erbB2). 
A;Reference number: A43273; MUID:92271253; PMID:1350381 
A;Accession: C43273 
A;Status: preliminary; nucleic acid sequence not shown; not compared with 
conceptual translation 
A;Molecule type: mRNA 
A;Residues: 1-637 <HOL> 
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.; 
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S. 
Mol. Cell. Biol. 14, 1909-1919, 1994 
A;Title: Structural and functional aspects of the multiplicity of Neu 
differentiation factors. 
A;Reference number: A56210; MUID:94158863; PMID:7509448 
A;Accession: I38407 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 119-406 <RES> 
A;Cross-references: EMBL:U02329; NID:g408408; PIDN:AAA19954.1; PID:g408409 
C;Genetics: 
A;Gene: GDB:HGL 
A;Cross-references: GDB:132656; OMIM:142445 
A;Map position: 8p22-8p11 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 
C;Keywords: alternative splicing 
F;182-221/Domain: EGF homology <EGF> 

Query Match 19.5%; Score 306.5; DB 2; Length 637; 
Best Local Similarity 33.0%; Pred. No. 9.1e-16; 
Matches 73; Conservative 36; Mismatches 61; Indels 51; Gaps 7; 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 

II MM : I I : I I : M I I II I : I I : : : : I 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234 

I I : I I I I I : : I : I : I : I I : M : I : I I I : I : : I I I : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 128 

Qy 235 YV NSVSTTLSSWSG--HARKCNETAKS 259 

I I I : I : I : I : I I III I : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS CKCPVGYTGDRCQQFAMVNF 297 

: I I I I I I : : : : : I I I I I : I I I I I I : I : I 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 229 



RESULT 7 
B43273 

heregulin, splice form beta 1 - human 
C; Species: Homo sapiens (man) 

C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002 
C;Accession: B43273; I38406 
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park, 
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.; 
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L. 
Science 256, 1205-1210, 1992 
A;Title: Identification of heregulin, a specific activator of p185(erbB2). 
A;Reference number: A43273; MUID:92271253; PMID:1350381 
A;Accession: B43273 
A;Status: preliminary; nucleic acid sequence not shown; not compared with 
conceptual translation 
A;Molecule type: mRNA 
A;Residues: 1-645 <HOL> 
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.; 
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S. 
Mol. Cell. Biol. 14, 1909-1919, 1994 
A;Title: Structural and functional aspects of the multiplicity of Neu 
differentiation factors. 
A;Reference number: A56210; MUID:94158863; PMID:7509448 
A;Accession: I38406 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 'A',95-418,'F',420-645 <RES> 
A;Cross-references: EMBL:U02328; NID:g408406; PIDN:AAA19953.1; PID:g408407 
C;Genetics: 
A;Gene: GDB:HGL 
A;Cross-references: GDB:132656; OMIM:142445 
A;Map position: 8p22-8p11 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 
C;Keywords: alternative splicing 
F;182-221/Domain: EGF homology <EGF> 

Query Match 19.5%; Score 306.5; DB 2; Length 645; 
Best Local Similarity 33.0%; Pred. No. 9.2e-16; 
Matches 73; Conservative 36; Mismatches 61; Indels 51; Gaps 7; 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 

II INI : I I : II : III I I I I : I I : : : : I 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234 

11:11111 : : I : I : I : I I : II : I : I II : I : : M I : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 128 

Qy 235 YV NSVSTTLSSWSG--HARKCNETAKS 259 

II I : I : I : I : I | | | j | : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS CKCPVGYTGDRCQQFAMVNF 297 

: I I I I I I : : : : : I I I I I : I I I I I I : I : I 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 229 



RESULT 8 
A45769 

acetylcholine receptor synthesis stimulator ARIA-1 precursor - chicken 
C; Species: Gallus gallus (chicken) 

C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 08-Sep-2002 
C;Accession: A45769 
R;Falls, D.L.; Rosen, K.M.; Corfas, G.; Lane, W.S.; Fischbach, G.D. 
Cell 72, 801-815, 1993 
A;Title: ARIA, a protein that stimulates acetylcholine receptor synthesis, is a 
member of the neu ligand family. 
A;Reference number: A45769; MUID:93201602; PMID:8453670 
A;Accession: A45769 
A;Status: preliminary 
A;Molecule type: mRNA; protein 
A;Residues: 1-602 <FAL> 
A;Cross-references: GB:L11264; NID:g212603; PIDN:AAA49037.1; PID:g212604 
A;Experimental source: brain 
A;Note: sequence extracted from NCBI backbone (NCBIN:127787, NCBIP:127788) 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 

Query Match 19.4%; Score 305.5; DB 2; Length 602; 
Best Local Similarity 33.0%; Pred. No. 1e-15; 
Matches 65; Conservative 37; Mismatches 74; Indels 21; Gaps 5; 



Qy 109 TEQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAA 168 

: I I I : I II : I I I I : I I : I | | : I I : | I 

Db 5 SEGPLQYSLAPTQTDVNS SYNTVPPKLKEMKNQEVAVGQKLVLRCETT 52 

Qy 169 AGNPQPSYRWFKDGKEL NRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENIL 225 

: I : : I I : I I I : I I : : : I : I I I : : I | I I I I : I 

Db 53 SEYPALRFKWLKNGKEITKKNRPENVKIP-KKQKKYSELHIYRATLADAGEYACRVSSKL 111 

Qy 226 GKDTVRGRLYVNSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGI NQLSCKC 280 

II:: : : : I : I : I I II : I : : I I I I I II : : : : I : I 
Db 112 GNDSTKASVIITDTNATSTSTTGTSHLTKCDIKQKAFCVNGGECYMVKDLPNPPRYLCRC 171 

Qy 281 PVGYTGDRCQQFAMVNF 297 

I : I I I I II : I : I 
Db 172 PNEFTGDRCQNYVMASF 188 



RESULT 9 
A56210 

neu differentiation factor - rat (fragment) 
C; Species: Rattus norvegicus (Norway rat) 

C;Date: 02-Jul-1996 #sequence_revision 02-Jul-1996 #text_change 08-Sep-2002 
C;Accession: A56210 
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.; 
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S. 
Mol. Cell. Biol. 14, 1909-1919, 1994 
A;Title: Structural and functional aspects of the multiplicity of Neu 
differentiation factors. 
A;Reference number: A56210; MUID:94158863; PMID:7509448 
A;Accession: A56210 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-230 <RES> 
A;Cross-references: EMBL:U02315; NID:g408380; PIDN:AAA19940.1; PID:g408381 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 

Query Match 19.0%; Score 299; DB 2; Length 230; 
Best Local Similarity 33.8%; Pred. No. 1e-15; 
Matches 67; Conservative 34; Mismatches 53; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198 

I I : I I : I I I I II I : I I : : : : I I I : I I I I I : I : I : I 

Db 23 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 82 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : I I : I : I M : I : : I I I : : II 
Db 83 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 140 

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CK 279 

I : I : I : I : I I III I : : I I I I I I : : : : : I II 
Db 141 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 200 

Qy 280 CPVGYTGDRCQQFAMVNF 297 

II : I I I I I I : I : I 

Db 201 CPNEFTGDRCQNYVMASF 218 



RESULT 10 
I61718 

neu differentiation factor - rat 
C;Species: Rattus norvegicus (Norway rat) 
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002 
C;Accession: I61718; I61721; I61720 
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.; 
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S. 
Mol. Cell. Biol. 14, 1909-1919, 1994 
A;Title: Structural and functional aspects of the multiplicity of Neu 
differentiation factors. 
A;Reference number: A56210; MUID:94158863; PMID:7509448 
A;Accession: I61718 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-636 <RES> 
A;Cross-references: EMBL:U02318; NID:g408386; PIDN:AAA19943.1; PID:g408387 
A;Accession: I61721 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-444,'A',446-636 <RE2> 
A;Cross-references: EMBL:U02321; NID:g408392; PIDN:AAA19946.1; PID:g408393 
A;Accession: I61720 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-298 , 386, , 388 TR' , 391 <RE3> 
A;Cross-references: EMBL:U02320; NID:g408390; PIDN:AAA19945.1; PID:g408391 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 
F;182-221/Domain: EGF homology <EGF> 

Query Match 19.0%; Score 299; DB 2; Length 636; 
Best Local Similarity 33.8%; Pred. No. 3.4e-15; 
Matches 67; Conservative 34; Mismatches 53; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198 

I I : I I : I I I I II I : M : : :: I I I : I I I I I : I : I : I 

Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 93 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : I I : I : I I I : I : : II I : : I | 

Db 94 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 151 

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CK 279 

I : I : I : I : I I III I : : I I I I I I : : : : : | | | 
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 211 

Qy 280 CPVGYTGDRCQQFAMVNF 297 

II : I I I I I I : I : I 

Db 212 CPNEFTGDRCQNYVMASF 229 



RESULT 11 
I61722 

neu differentiation factor - rat 
C;Species: Rattus norvegicus (Norway rat) 
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002 
C;Accession: I61722 
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.; 
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S. 
Mol. Cell. Biol. 14, 1909-1919, 1994 
A;Title: Structural and functional aspects of the multiplicity of Neu 
differentiation factors. 
A;Reference number: A56210; MUID:94158863; PMID:7509448 
A;Accession: I61722 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-662 <RES> 
A;Cross-references: EMBL:U02322; NID:g408394; PIDN:AAA19947.1; PID:g408395 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 
F;182-221/Domain: EGF homology <EGF> 

Query Match 19.0%; Score 299; DB 2; Length 662; 
Best Local Similarity 33.8%; Pred. No. 3.6e-15; 
Matches 67; Conservative 34; Mismatches 53; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198 

I I : I I : I I I I II I : I I : : : : I I I : I I I I I : I : I : I 

Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 93 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : II : I : I I I : I : : I I I : : M 
Db 94 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 151 

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CK 279 

I : I: I : I : I I III I : : I I I I I I : : : : : I I | 
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 211 

Qy 280 CPVGYTGDRCQQFAMVNF 297 

II : II II I I : I : I 

Db 212 CPNEFTGDRCQNYVMASF 229 



RESULT 12 
S32359 

glial growth factor - bovine 

C;Species: Bos primigenius taurus (cattle) 
C;Date: 19-Mar-1997 #sequence_revision 01-Aug-1997 #text_change 08-Sep-2002 
C;Accession: S32359 
R;Marchionni, M.A.; Goodearl, A.D.J.; Chen, M.S.; Bermingham-McDonogh, O.; Kirk, 
C.; Hendricks, M.; Danehy, F.; Misumi, D.; Sudhalter, J.; Kobayashi, K.; 
Wroblewski, D.; Lynch, C.; Baldassare, M.; Hiles, I.; Davis, J.B.; Hsuan, J.J.; 
Totty, N.F.; Otsu, M.; McBurney, R.N.; Waterfield, M.D.; Stroobant, P.; Gwynne, 
D. 
Nature 362, 312-318, 1993 
A;Title: Glial growth factors are alternatively spliced erbB2 ligands expressed 
in the nervous system. 
A;Reference number: S32357; MUID:93205115; PMID:8096067 
A;Accession: S32359 
A;Status: preliminary 
A;Molecule type: mRNA 
A;Residues: 1-241 <MAR> 
A;Cross-references: GB:L12259; NID:g289413; PIDN:AAA30540.1; PID:g289414 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 
F;182-221/Domain: EGF homology <EGF> 

Query Match 18.6%; Score 292; DB 2; Length 241; 
Best Local Similarity 32.8%; Pred. No. 3.8e-15; 
Matches 65; Conservative 37; Mismatches 52; Indels 44; Gaps 6; 



Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKEL NRSRDIRIKYGNG 198 

I I : I I : I I M I I I : I I : : : : I I | : | | | | : : : | : | : | 

Db 34 ALPPRLKEMKSQESVAGSKLVLRCETSSEYSSLKFKWFKNGSELSRKNKPQNIKIQKRPG 93 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : : I : I : I I I : t : : I I I : r II 

Db 94 K — SELRISKASLADSGEYMCKVISKLGNDSAST^ITIVESNEITTGMPASTETAYVSSE 151 

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CK 279 

I : I : I : I : I I III I : : II I M I : : : : : I II 
Db 152 SPIRISVSTEGTNTSSSTSTSTAGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCK 211 

Qy 280 CPVGYTGDRCQQFAMVNF 297 

II : I I I I I I : I : I 

Db 212 CPNEFTGDRCQNYVMASF 229 



RESULT 13 
I38404 

neu differentiation factor - human 
C;Species: Homo sapiens (man) 
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002 
C;Accession: I38404 
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.; 
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S. 
Mol. Cell. Biol. 14, 1909-1919, 1994 
A;Title: Structural and functional aspects of the multiplicity of Neu 
differentiation factors. 
A;Reference number: A56210; MUID:94158863; PMID:7509448 
A;Accession: I38404 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-462 <RES> 
A;Cross-references: EMBL:U02326; NID:g408402; PIDN:AAA19951.1; PID:g408403 
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology 

Query Match 17.8%; Score 280.5; DB 2; Length 462; 
Best Local Similarity 32.1%; Pred. No. 6.2e-14; 
Matches 69; Conservative 35; Mismatches 60; Indels 51; Gaps 7; 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 

II I I I I : I I : I I : II I I II I : II : : : : I 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234 

M : I I I I I : : I : I : I : I I : II : I : M I : I : : II I : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 128 

Qy 235 YV NSVSTTLSSWSG--HARKCNETAKS 259 

II I : I : I : I : I I I I I I : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 



Qy 260 YCVNGGVCYYIEGINQLS CKCPVGYTGDRCQQ 291 

: I I I I I I : : : : : I III I : I I I I : 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCQPGFTGARCTE 223 



RESULT 14 
A43273 

heregulin precursor, splice form alpha - human 

N;Alternate names: breast cancer cell differentiation factor p45; Neu
differentiation factor
C;Species: Homo sapiens (man)
C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002
C;Accession: A43273; A48498; A38155
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park,
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.;
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L.
Science 256, 1205-1210, 1992
A;Title: Identification of heregulin, a specific activator of p185(erbB2).
A;Reference number: A43273; MUID:92271253; PMID:1350381
A;Accession: A43273
A;Status: nucleic acid sequence not shown; not compared with conceptual
translation
A;Molecule type: mRNA
A;Residues: 1-640 <HOL>
A;Experimental source: breast tumor cell line, MDA-MB-231, ATCC HTB 26
A;Note: sequence extracted from NCBI backbone (NCBIP:103250)
R;Culouscou, J.M.; Plowman, G.D.; Carlton, G.W.; Green, J.M.; Shoyab, M.
J. Biol. Chem. 268, 18407-18410, 1993
A;Title: Characterization of a breast cancer cell differentiation factor that
specifically activates the HER4/p180(erbB4) receptor.
A;Reference number: A48498; MUID:93366731; PMID:7689552
A;Accession: A48498
A;Molecule type: protein
A;Residues: 20-21, ,23-24,'XX',27-28 <CUL>
R;Peles, E.; Bacus, S.S.; Koski, R.A.; Lu, H.S.; Wen, D.; Ogden, S.G.; Levy,
R.B.; Yarden, Y.
Cell 69, 205-216, 1992
A;Title: Isolation of the neu/HER-2 stimulatory ligand: a 44 kd glycoprotein
that induces differentiation of mammary tumor cells.
A;Reference number: A38155; MUID:92208945; PMID:1348215
A;Accession: A38155
A;Molecule type: protein
A;Residues: 'X',15-16,'X',18-20,'RG',23-24,'GP',27,'E',29,'XP',32-36 <PEL>
A;Note: sequence extracted from NCBI backbone (NCBIP:91347)
C;Genetics:
A;Gene: GDB:HGL
A;Cross-references: GDB:132656; OMIM:142445
A;Map position: 8p22-8p11
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
C;Keywords: alternative splicing; glycoprotein
F;182-221/Domain: EGF homology <EGF>

Query Match 17.8%; Score 279.5; DB 2; Length 640; 

Best Local Similarity 32.1%; Pred. No. 1.1e-13;

Matches 69; Conservative 35; Mismatches 60; Indels 51; Gaps 7; 



Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 



' ■ I I - I f • I I I I II I ; I I . . . . I 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPQLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234 

11:11111 ::|:|: |: I |: II : |:|||:|: : II |: : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 128

Qy 235 YV NSVSTTLSSWSG— HARKCNETAKS 259 

II I : I : I : I : I I I I I I : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS CKCPVGYTGDRCQQ 291 

: I I I I I I : : : : : I III I : I I I I : 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCQPGFTGARCTE 223 
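The percentages in the statistics block above can be checked by hand. The following is a minimal sketch, assuming (this is inferred from the numbers in this report, not stated by GenCore) that "Query Match" is the hit score divided by the perfect score, and that "Best Local Similarity" is the number of identical positions divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Values taken from RESULT 14 (A43273); the perfect score is 1574.
print(round(query_match_pct(279.5, 1574), 1))               # 17.8
print(round(best_local_similarity_pct(69, 35, 60, 51), 1))  # 32.1
```

Under the same assumptions the RESULT 15 figures (score 273; 63, 33, 52 and 44 columns) give 17.3 and 32.8, which agrees with the values printed for that hit.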



RESULT 15 
I61719

neu differentiation factor - rat 

C;Species: Rattus norvegicus (Norway rat)
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002
C;Accession: I61719; I61723; I61716; I61717; I61724; A38220
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: I61719
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-639 <RES>
A;Cross-references: EMBL:U02319; NID:g408388; PIDN:AAA19944.1; PID:g408389
A;Accession: I61723
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-639 <RE2>
A;Cross-references: EMBL:U02323; NID:g408396; PIDN:AAA19948.1; PID:g408397
A;Accession: I61716
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-422,'H','NL',637-638,'ELRRNKAYRSKCMQIQLSATHLRPSSITHLGFIL' <RE3>
A;Cross-references: EMBL:U02316; NID:g408382; PIDN:AAA19941.1; PID:g408383
A;Accession: I61717
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-422,'H','NL',637-638,'ELRRNKAYRSKCMQIQLSATHLRPSSITHLGFIL' <RE4>
A;Cross-references: EMBL:U02317; NID:g408384; PIDN:AAA19942.1; PID:g408385
A;Accession: I61724
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-422 <RE5>
A;Cross-references: EMBL:U02324; NID:g408398; PIDN:AAA19949.1; PID:g408399
R;Wen, D.; Peles, E.; Cupples, R.; Suggs, S.V.; Bacus, S.S.; Luo, Y.; Trail, G.;
Hu, S.; Silbiger, S.M.; Levy, R.B.; Koski, R.A.; Lu, H.S.; Yarden, Y.
Cell 69, 559-572, 1992
A;Title: Neu differentiation factor: a transmembrane glycoprotein containing an
EGF domain and an immunoglobulin homology unit.
A;Reference number: A38220; MUID:92257596; PMID:1349853
A;Accession: A38220
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-422 <WEN>
A;Note: sequence extracted from NCBI backbone (NCBIN:101767, NCBIP:101768)
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology

Query Match 17.3%; Score 273; DB 2; Length 639; 

Best Local Similarity 32.8%; Pred. No. 3.4e-13; 

Matches 63; Conservative 33; Mismatches 52; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198 

I I : I I : I I I I II I : I I : : : : | | I : I I I I I : I : I : I 

Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 93

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : II : I : I M : I : : I I I : : | | 

Db 94 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 151 

Qy 237 NSVSTTLSSWSG — HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279 

I : I : I : I : I I III I : : I I I I I I : : : : : I I I 
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 211 

Qy 280 CPVGYTGDRCQQ 291

I I : I I I I : 
Db 212 CQPGFTGARCTE 223 
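The row between each Qy and Db line marks how the two residues in a column compare. As a rough illustration only, the sketch below rebuilds just the identity marks (`|`) for a pair of already-aligned rows. The `:` marks for conservative substitutions are left out on purpose: they presumably depend on the BLOSUM62 scoring table named in the report header, which is not reproduced here. The function name `identity_line` is hypothetical.

```python
def identity_line(query: str, subject: str) -> str:
    """Return '|' where two aligned rows hold the same residue, ' ' elsewhere.

    Both rows must already be aligned (equal length, '-' for gaps).
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    marks = []
    for q, s in zip(query, subject):
        is_gap = q == "-" or s == "-"
        marks.append("|" if (q == s and not is_gap) else " ")
    return "".join(marks)


# Last stanza of RESULT 15: Qy 280-291 against Db 212-223.
qy = "CPVGYTGDRCQQ"
db = "CQPGFTGARCTE"
line = identity_line(qy, db)
print(repr(line))       # '|  | || ||  '
print(line.count("|"))  # 6
```

Summing the `|` count over every stanza of a hit should reproduce the "Matches" figure in that hit's statistics block.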



Search completed: August 17, 2004, 14:13:20 
Job time : 14.7611 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:          August 11, 2004, 14:12:46 ; Search time 40.3344 Seconds
                 (without alignments)
                 2319.368 Million cell updates/sec

Title:           US-09-864-675-4
Perfect score:   1574
Sequence:        1 MRRDPAPGFSMLLFGVSLAC KCPVGYTGDRCQQFAMVNFS 298

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        1292805 seqs, 313927144 residues

Total number of hits satisfying chosen parameters: 1292805

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :       Published_Applications_AA:*
                 1: /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
                 2: /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                 3: /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                 4: /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
                 5: /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                 6: /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:*
                 7: /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                 8: /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
                 9: /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:*
                 10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep:*
                 11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep:*
                 12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
                 13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
                 14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
                 15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep:*
                 16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
                 17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*
                 18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
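
The report says only that Pred. No. comes from "analysis of the total score distribution"; it does not describe the procedure. Purely as an illustration of the general idea, and not as GenCore's actual method, the sketch below fits a Gumbel (extreme value) distribution to a list of background scores by the method of moments and multiplies the tail probability by the database size. Every name and the sample scores are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(scores):
    """Method-of-moments Gumbel fit; returns (location mu, scale beta)."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def predicted_chance_hits(score, mu, beta, db_size):
    """Expected number of database entries scoring >= score by chance."""
    # P(S >= score) = 1 - exp(-exp(-(score - mu) / beta)); expm1 keeps
    # precision when that probability is very small.
    tail = -math.expm1(-math.exp(-(score - mu) / beta))
    return db_size * tail


# Hypothetical background scores, standing in for unrelated database hits.
background = [30, 35, 40, 45, 50, 55, 60]
mu, beta = fit_gumbel(background)
for s in (60, 100, 300):
    print(s, predicted_chance_hits(s, mu, beta, db_size=1292805))
```

The printed values fall steeply as the score rises, which is the qualitative behaviour of the Pred. No. column: high-scoring hits have very small predicted chance counts.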

SUMMARIES

Result           Query
   No.   Score   Match  Length  DB  ID                  Description
     1    1574   100.0     298   9  US-09-864-675-4     Sequence 4, Appli
     2    1505    95.6     330   9  US-09-864-675-2     Sequence 2, Appli
     3    1505    95.6     422  15  US-10-447-839A-3    Sequence 3, Appli
     4    1505    95.6     426  15  US-10-447-839A-2    Sequence 2, Appli
     5    1505    95.6     850  16  US-10-408-765A-610  Sequence 610, App
     6     776    49.3     469  13  US-10-096-241-8     Sequence 8, Appli
     7     776    49.3     647  13  US-10-096-241-32    Sequence 32, Appl
     8     736    46.8     407  13  US-10-096-241-6     Sequence 6, Appli
     9     716    45.5     181  13  US-10-096-241-4     Sequence 4, Appli
    10     716    45.5     605  13  US-10-096-241-2     Sequence 2, Appli
    11     544    34.6     422   8  US-08-736-019-170   Sequence 170, App
    12                                                  Sequence 3, Appli
    13                                                  Sequence 3, Appli
    14                                                  Sequence 3, Appli
    21                           9  US-09-792-025-11
    22            20.6           9  US-09-849-868-11
    23            20.6           9  US-09-808-602-85
    24            20.6          14  US-10-290-578-2
    25     325    20.6          14  US-10-453-183-11
    26            20.1           9  US-09-795-668-2
    27     317    20.1           9  US-09-795-686-2
    28     317    20.1     192   9  US-09-946-807-2
    29   306.5    19.5           9  US-09-795-668-18
    30   306.5    19.5     239   9  US-09-795-686-18
    31   306.5    19.5     239   9  US-09-946-807-18
    32   306.5    19.5           9  US-09-795-668-14
    33   306.5    19.5     629   9  US-09-795-686-14
    34   306.5    19.5     629   9  US-09-946-807-14
    35   306.5    19.5           9  US-09-795-668-13
    36   306.5    19.5           9  US-09-795-686-13
    37   306.5    19.5           9  US-09-946-807-13
    38            19.5          13  US-10-096-241-10
    39            19.4           9  US-09-773-517-7
    40            19.4           9  US-09-792-025-7
    41            19.4           9  US-09-849-868-7
    42            19.4          14  US-10-453-183-7
    43            19.4           9  US-09-773-517-9
    44            19.4           9  US-09-792-025-9
    45   305.5    19.4           9  US-09-849-868-9

ALIGNMENTS



RESULT 1 
US-09-864-675-4 

; Sequence 4, Application US/09864675 

; Patent No. US20020081286A1 

; GENERAL INFORMATION: 

; APPLICANT: Marchionni, Mark 



; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES, 

; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS 

; FILE REFERENCE: 04585/049002 

; CURRENT APPLICATION NUMBER: US/09/864,675

; CURRENT FILING DATE: 2001-05-23 

; PRIOR APPLICATION NUMBER: US 60/206,495 

; PRIOR FILING DATE: 2000-05-23 

; NUMBER OF SEQ ID NOS: 18

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 4 

; LENGTH: 298
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-864-675-4 



Query Match 100.0%; Score 1574; DB 9; Length 298;

Best Local Similarity 100.0%; Pred. No. 3.3e-121;

Matches 298; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298



RESULT 2
US-09-864-675-2

; Sequence 2, Application US/09864675 

; Patent No. US20020081286A1 

; GENERAL INFORMATION: 

; APPLICANT: Marchionni, Mark 

; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES, 

; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS 

; FILE REFERENCE: 04585/049002 

; CURRENT APPLICATION NUMBER: US/09/864,675 

; CURRENT FILING DATE: 2001-05-23 

; PRIOR APPLICATION NUMBER: US 60/206,495 

; PRIOR FILING DATE: 2000-05-23 

; NUMBER OF SEQ ID NOS: 18 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 

; LENGTH: 330
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-864-675-2 



Query Match 95.6%; Score 1505; DB 9; Length 330; 

Best Local Similarity 98.6%; Pred. No. 1.8e-115;

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
        ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289



RESULT 3 

US-10-447-839A-3 

; Sequence 3, Application US/10447839A 

; Publication No. US20040018181A1

; GENERAL INFORMATION: 

; APPLICANT: Kufe, Donald W. 

; APPLICANT: Kharbanda, Surender 

; APPLICANT: Weitman, Steven D. 

; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED
THEREFROM

; FILE REFERENCE: 1000.05.009

; CURRENT APPLICATION NUMBER: US/10/447,839A

; CURRENT FILING DATE: 2003-05-29 

; PRIOR APPLICATION NUMBER: 10/293,391 

; PRIOR FILING DATE: 2002-11-13
; PRIOR APPLICATION NUMBER: 09/951,938
; PRIOR FILING DATE: 2001-09-11
; PRIOR APPLICATION NUMBER: 60/231,841
; PRIOR FILING DATE: 2000-09-11
; NUMBER OF SEQ ID NOS: 109
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 3
; LENGTH: 422
; TYPE: PRT
; ORGANISM: Homo Sapien
US-10-447-839A-3 



Query Match 95.6%; Score 1505; DB 15; Length 422;



Best Local Similarity 98.6%; Pred. No. 2.4e-115; 

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 



Qy    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
        ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db  333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 381



RESULT 4 

US-10-447-839A-2 

; Sequence 2, Application US/10447839A 

; Publication No. US20040018181A1 

; GENERAL INFORMATION:

; APPLICANT: Kufe, Donald W. 

; APPLICANT: Kharbanda, Surender 

; APPLICANT: Weitman, Steven D. 

; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED
THEREFROM

; FILE REFERENCE: 1000.05.009

; CURRENT APPLICATION NUMBER: US/10/447,839A

; CURRENT FILING DATE: 2003-05-29 

; PRIOR APPLICATION NUMBER: 10/293,391 

; PRIOR FILING DATE: 2002-11-13 

; PRIOR APPLICATION NUMBER: 09/951,938 

; PRIOR FILING DATE: 2001-09-11 

; PRIOR APPLICATION NUMBER: 60/231,841 

; PRIOR FILING DATE: 2000-09-11 

; NUMBER OF SEQ ID NOS: 109
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 2
; LENGTH: 426
; TYPE: PRT
; ORGANISM: Homo Sapien
US-10-447-839A-2 

Query Match 95.6%; Score 1505; DB 15; Length 426; 

Best Local Similarity 98.6%; Pred. No. 2.4e-115; 

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
        ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db  333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 381



RESULT 5 

US-10-408-765A-610 

Sequence 610, Application US/10408765A 
Publication No. US20040101874A1
GENERAL INFORMATION: 
APPLICANT: Ghosh, Soumitra S. 
APPLICANT: Fahy, Eoin D. 
APPLICANT: Zhang, Bing 
APPLICANT: Gibson, Bradford W. 
APPLICANT: Taylor, Steven W. 
APPLICANT: Glenn, Gary M. 
APPLICANT: Warnock, Dale E. 

TITLE OF INVENTION: TARGETS FOR THERAPEUTIC INTERVENTION 
TITLE OF INVENTION: IDENTIFIED IN THE MITOCHONDRIAL PROTEOME 
FILE REFERENCE: 660088.465 

CURRENT APPLICATION NUMBER: US/10/408,765A
CURRENT FILING DATE: 2003-04-04
NUMBER OF SEQ ID NOS: 3077

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 610 
LENGTH: 850 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-408-765A-610 



Query Match 95.6%; Score 1505; DB 16; Length 850; 

Best Local Similarity 98.6%; Pred. No. 5.8e-115; 

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
        ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db  333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 381



RESULT 6 
US-10-096-241-8 

; Sequence 8, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J.

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

; AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
; CITY: Boston 

STATE: MA 
; COUNTRY: US 

ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 8:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 469 amino acids

TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 



SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
US-10-096-241-8 



Query Match 49.3%; Score 776; DB 13; Length 469; 

Best Local Similarity 97.3%; Pred. No. 2.4e-55; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 
Qy  142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy  202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy  262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
        |||||||||||||||||||| |: | ||
Db  151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 7 

US-10-096-241-32 

; Sequence 32, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
; REGISTRATION NUMBER: 32,983 

REFERENCE/ DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 
TELEX: <Unknown> 



INFORMATION FOR SEQ ID NO: 32: 
SEQUENCE CHARACTERISTICS:

LENGTH: 647 amino acids
TYPE: amino acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
US-10-096-241-32 

Query Match 49.3%; Score 776; DB 13; Length 647; 

Best Local Similarity 97.3%; Pred. No. 3.6e-55; 

Matches 144; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy  142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy  202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy  262 VNGGVCYYIEGINQLSCKCPVGYTGDRC 289
        |||||||||||||||||||| |: | ||
Db  151 VNGGVCYYIEGINQLSCKCPNGFFGQRC 178



RESULT 8 
US-10-096-241-6 

; Sequence 6, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/10/096,241

FILING DATE: 12-Mar-2002 
; CLASSIFICATION: <Unknown> 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 



ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/ DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: <Unknown> 
INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 407 amino acids
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
US-10-096-241-6 

Query Match 46.8%; Score 736; DB 13; Length 407; 

Best Local Similarity 97.1%; Pred. No. 3.9e-52; 

Matches 136; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy  150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy  210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy  270 IEGINQLSCKCPVGYTGDRC 289
        |||||||||||| |: | ||
Db  121 IEGINQLSCKCPNGFFGQRC 140



RESULT 9 
US-10-096-241-4 

; Sequence 4, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
; NUMBER OF SEQUENCES: 33 

; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street 

CITY: Boston 

STATE: MA 
; COUNTRY: US 

ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 






SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/ DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906
TELEX: <Unknown>
INFORMATION FOR SEQ ID NO: 4:
SEQUENCE CHARACTERISTICS:

LENGTH: 181 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
US-10-096-241-4 

Query Match 45.5%; Score 716; DB 13; Length 181;

Best Local Similarity 94.3%; Pred. No. 6.2e-51; 

Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 

Qy  150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
        ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db    1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy  210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
        :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db   61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy  270 IEGINQLSCKCPVGYTGDRC 289
        |||||||||||| |: | ||
Db  121 IEGINQLSCKCPNGFFGQRC 140



RESULT 10 
US-10-096-241-2 

; Sequence 2, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 



CITY: Boston 
; STATE: MA 

; COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241 
; FILING DATE: 12-Mar-2002 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983
REFERENCE/ DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 605 amino acids 

TYPE: amino acid 
; STRANDEDNESS: not relevant 

TOPOLOGY: linear 
; MOLECULE TYPE: protein 

FRAGMENT TYPE: internal 
SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
US-10-096-241-2 

Query Match 45.5%; Score 716; DB 13; Length 605; 

Best Local Similarity 94.3%; Pred. No. 2.8e-50; 

Matches 132; Conservative 4; Mismatches 4; Indels 0; Gaps 0;

Qy  150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
        ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db    1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy  210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
        :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db   61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy  270 IEGINQLSCKCPVGYTGDRC 289
        |||||||||||| |: | ||
Db  121 IEGINQLSCKCPNGFFGQRC 140



RESULT 11 
US-08-736-019-170 

; Sequence 170, Application US/08736019 
; Publication No. US20030207799A1
; GENERAL INFORMATION: 



APPLICANT: Goodearl, Andrew 
APPLICANT: Stroobant, Paul 
APPLICANT: Minghetti, Luisa 
APPLICANT: Waterfield, Michael 
APPLICANT: Marchionni, Mark 
APPLICANT: Chen, Mario 
APPLICANT: Hiles, Ian 

TITLE OF INVENTION: GLIAL MITOGENIC FACTORS, THEIR 
TITLE OF INVENTION: PREPARATION AND USE 
NUMBER OF SEQUENCES: 189 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Clark & Elbing LLP 
STREET: 176 Federal Street 
CITY: Boston 
STATE: Massachusetts 
COUNTRY: U.S.A. 
ZIP: 02110 
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 
COMPUTER: IBM Compatible Pentium 
OPERATING SYSTEM: Windows95 
SOFTWARE: FastSeq Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/736,019
FILING DATE: 22-OCT-1996 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/471,833 
FILING DATE: 06-JUN-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/036,555 
FILING DATE: 24-MAR-1993 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/965,173 
FILING DATE: 23-OCT-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/907,138 
FILING DATE: 30-JUN-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/940,389 
FILING DATE: 03-SEP-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/863,703 
FILING DATE: 03-APR-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: UK 91 07566.3 
FILING DATE: 10-APR-1991
ATTORNEY/AGENT INFORMATION: 

NAME: Bieker-Brady, Kristina 
REGISTRATION NUMBER: 39,109 
REFERENCE/DOCKET NUMBER: 04585/00200Q 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (617) 428-0200 
TELEFAX: (617) 428-7045 
TELEX: 

INFORMATION FOR SEQ ID NO: 170: 
SEQUENCE CHARACTERISTICS:
LENGTH: 422
TYPE: amino acid

STRANDEDNESS: 
TOPOLOGY: linear 
US-08-736-019-170 



Query Match 34.6%; Score 544; DB 8; Length 422;

Best Local Similarity 35.6%; Pred. No. 2.5e-36; 

Matches 127; Conservative 62; Mismatches 90; Indels 78; Gaps 13; 

Qy 15 GVSLACYS--PSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS--NSTRE 59

11:111 I I : I I I : I : I M : I M I | | : | | : | { 

Db 58 GASV-CYSSPPSVGSVQELAQRAAVVIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116

Qy 60 PPASGRVA LVKVLDKWPLRSGGLQ 83 

, ' " • ' I I I I I I : : : I I I : 

Db 117 PPAAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLK 176

Qy 84 REQVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128

'''\ M I : : I I I I I : I I : | I : : I | | : | | : | 

Db 177 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRN 235

Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188

mil II l:||:|lll II |:|| :: ::|M:| MM 

Db 236 LKKEVSRVLCKRCALPPQLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRK 295

Qy 189 RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245 

: I : I : I : I M M : h M M I : : M M : : : | : | 
Db 296 NKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTS 353

Qy 246 WSG--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297

= 1 I Ml MM M M M :: :: I MM M M M I : I : I 
Db 354 TTGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 410



RESULT 12 
US-09-795-668-3 

; Sequence 3, Application US/09795668 

; Patent No. US20020045577A1

; GENERAL INFORMATION: 

; APPLICANT: Stefansson, Hreinn 

; APPLICANT: Steinthorsdottir, Valgerdur 

; APPLICANT: Gulcher, Jeffrey R. 

; TITLE OF INVENTION: HUMAN SCHIZOPHRENIA GENE 

; FILE REFERENCE: 2345.2004-001
; CURRENT APPLICATION NUMBER: US/09/795,668
; CURRENT FILING DATE: 2001-02-28 
; PRIOR APPLICATION NUMBER: US 09/515,716 
; PRIOR FILING DATE: 2000-02-28 
; NUMBER OF SEQ ID NOS : 1531 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 3 

LENGTH: 418 
; TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-795-668-3 



Query Match 34.4%; Score 542; DB 9; Length 418; 

Best Local Similarity 35.2%; Pred. No. 3.6e-36;

Matches 125; Conservative 62; Mismatches 92; Indels 76; Gaps 



Qy 15 GVSLACYSPSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS--NSTREPP 61

1 1 : 1 1 1 : 1 1 1 : 1 : 1 1 1 : 1 1 1 1 | | ; | | ; | | | |

Db 56 GASV-CSPPSVGSVQELAQRAAVVIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDREPP 114

Qy 62 ASGRVA LVKVLDKWPLRSGGLQRE 85

1 = 11 1 1 1 1 1 : : : 1 1 1 : : :

Db 115 AAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLKKD 174

Qy 86 QVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKNLK 130

= = = 1 II 1 : : 11 1 1 1 : M : 1 1 : : 1 1 1 : 1 I : I I |

Db 175 SLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRNLK 233

Qy 131 KEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS-- 188

' ' 1 • ■ t 1 1 1 1 ' 1 1 • 1 1 1 1 1 1 1 : I 1 : : : : | | | : | | | { |

Db 234 KEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNK 293

Qy 189 -RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWS 247

• : 1 • 1 : 1 : 1 1 : I | : I : | | | : | : : | | | : : : : | : | :

Db 294 PQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTSTT 351

Qy 248 G--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297

1 1 N 1 |::||||| 1: :: :: | |||| :|||||| : | :|

Db 352 GTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 406





RESULT 13 
US-09-795-686-3 

; Sequence 3, Application US/09795686 

; Patent No. US20020094954A1 

; GENERAL INFORMATION: 

; APPLICANT: Stefansson, Hreinn 

; APPLICANT: Steinthorsdottir , Valgerdur 

; APPLICANT: Gulcher, Jeffrey R. 

; TITLE OF INVENTION: HUMAN SCHIZOPHRENIA GENE 

; FILE REFERENCE: 2345.2005-001 

; CURRENT APPLICATION NUMBER: US/09/795,686

; CURRENT FILING DATE: 2001-02-28

; PRIOR APPLICATION NUMBER: US 09/515,715 

; PRIOR FILING DATE: 2000-02-28 

; NUMBER OF SEQ ID NOS : 1531 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 3 

LENGTH: 418 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-795-686-3 

Query Match 34.4%; Score 542; DB 9; Length 418; 

Best Local Similarity 35.2%; Pred. No. 3.6e-36; 

Matches 125; Conservative 62; Mismatches 92; Indels 76; Gaps 12; 

Qy 15 GVSLACYSPSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS--NSTREPP 61

Db 56 GASV-CSPPSVGSVQELAQRAAVVIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDREPP 114

Qy 62 ASGRVA LVKVLDKWPLRSGGLQRE 85

1 = 11 1 1 1 1 1 : : : 1 1 1 : : :

Db 115 AAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLKKD 174

Qy 86 QVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKNLK 130

• : : 1 II 1 : : 1 1 1 1 1 : 1 1 : I | : : | { | : | | : | | |

Db 175 SLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRNLK 233

Qy 131 KEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS-- 188

1 1 1 : : M M 1 : 1 1 : 1 1 1 1 II 1 : 1 1 : : : : 1 1 1 : 1 1 1 1 I

Db 234 KEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNK 293

Qy 189 -RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWS 247

• • 1 • 1 - 1 • 11*11 • 1 : 1 1 1 : 1 : : i 1 1 : : : : I : | :

Db 294 PQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTSTT 351

Qy 248 G--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297

1 1 II 1 I:: : :: :: | |||| :||MII : 1 :|

Db 352 GTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 406





RESULT 14 
US-09-946-807-3 

; Sequence 3, Application US/09946807 

; Patent No. US20020165144A1 

; GENERAL INFORMATION: 

; APPLICANT: Stefansson, Hreinn 

; APPLICANT: Steinthorsdottir , Valgerdur 

; APPLICANT: Gulcher, Jeffrey R. 

; TITLE OF INVENTION: HUMAN SCHIZOPHRENIA GENE 

; FILE REFERENCE: 2345.2004-001

; CURRENT APPLICATION NUMBER: US/09/946,807

; CURRENT FILING DATE: 2001-09-05

; PRIOR APPLICATION NUMBER: US/09/795,668

; PRIOR FILING DATE: 2001-02-28 

; PRIOR APPLICATION NUMBER: US 09/515,716 

; PRIOR FILING DATE: 2000-02-28 

; NUMBER OF SEQ ID NOS : 1531 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 3 

LENGTH: 418 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-946-807-3 

Query Match 34.4%; Score 542; DB 9; Length 418; 

Best Local Similarity 35.2%; Pred. No. 3.6e-36; 

Matches 125; Conservative 62; Mismatches 92; Indels 76; Gaps 12; 

Qy 15 GVSLACYSPSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS--NSTREPP 61

I I • I I I : I I I : I : I I I : I M I I I : I | : | | | | 

Db 56 GASV-CSPPSVGSVQELAQRAAVVIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDREPP 114

Qy 62 ASGRVA LVKVLDKWPLRSGGLQRE 85

I : I I

Db 115 AAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLKKD 174



Qy 86 QVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKNLK 130 

:::| II |: : lllll:|| : I I : : I I I : I I : I I I 

Db 175 SLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRNLK 233 

Qy 131 KEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS-- 188

I I I : : I I II I : II : II II II I : I I : : : : I I I : I I I I I 

Db 234 KEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNK 293 

Qy 189 -RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWS 247 

: : I : I : I : I I : II : I : I I I : I : : I I I : : : : I : I : 

Db 294 PQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTSTT 351

Qy 248 G--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVNF 297

I I III I : : I I I I I I : : : : : I I I I I : I II I II : I : I 
Db 352 GTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMASF 406 
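
Each hit in this report is followed by a statistics line of the form "Query Match ...; Score ...; DB ...; Length ...;". The following is a minimal sketch of how such a line could be read programmatically. The regular expression and the function name are assumptions made for illustration, not part of GenCore, and lines damaged by OCR (for example a comma in place of the decimal point) are deliberately rejected rather than guessed at.

```python
import re

# Pattern for the first statistics line of a hit, e.g.
# "Query Match 34.4%; Score 542; DB 9; Length 418;"
STATS_RE = re.compile(
    r"Query Match\s+(?P<match>\d+(?:\.\d+)?)%;\s*"
    r"Score\s+(?P<score>\d+(?:\.\d+)?);\s*"
    r"DB\s+(?P<db>\d+);\s*"
    r"Length\s+(?P<length>\d+);"
)


def parse_hit_stats(line):
    """Return the four fields as a dict, or None if the line does not parse."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "query_match": float(m.group("match")),
        "score": float(m.group("score")),
        "db": int(m.group("db")),
        "length": int(m.group("length")),
    }


print(parse_hit_stats("Query Match 34.4%; Score 542; DB 9; Length 418;"))
```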



RESULT 15 
US-10-096-241-33 

; Sequence 33, Application US/10096241 
; Publication No. US20020127594A1 
; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
; Busfield, Samantha J. 

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
; STREET: 225 Franklin Street 

CITY: Boston 
; STATE: MA 

; COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 

; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/10/096,241 

; FILING DATE: 12-Mar-2002 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 

; ATTORNEY/AGENT INFORMATION: 

; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906 

; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 33: 



SEQUENCE CHARACTERISTICS: 

LENGTH: 139 amino acids 
TYPE: amino acid 
; STRANDEDNESS: not relevant 

; TOPOLOGY: linear 

MOLECULE TYPE: protein 
SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
US-10-096-241-33 

Query Match 32.0%; Score 504; DB 13; Length 139; 

Best Local Similarity 93.9%; Pred. No. 1.2e-33; 

Matches 92; Conservative 3; Mismatches 3; Indels 0; Gaps 0; 

Qy 192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHAR 251
       ||||||||||||||||||:||||||||||||||||||||||||:||||||||||||||||
Db   1 RIKYGNGRKNSRLQFNKVRVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHAR 60

Qy 252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
       |||||||||||||||||||||||||||||| |: | ||
Db  61 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 98
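
Because the alignment above has no indels, its Matches count can be cross-checked by comparing the query and database rows position by position. The sketch below does that; the function name is hypothetical, and the two strings are simply the Qy and Db rows of this record joined together.

```python
def count_identities(query, subject):
    """Count identical positions in an ungapped, equal-length alignment."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment rows must have equal length")
    identical = sum(q == s for q, s in zip(query, subject))
    return identical, len(query)


# Qy 192-289 and Db 1-98 of US-10-096-241-33, rows concatenated.
QY = (
    "RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHAR"
    "KCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC"
)
DB = (
    "RIKYGNGRKNSRLQFNKVRVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHAR"
    "KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC"
)

identical, columns = count_identities(QY, DB)
print(identical, columns)
```

The six differing positions are the ones the report splits into 3 conservative substitutions and 3 mismatches; which is which depends on the BLOSUM62 table and is not recomputed here.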



Search completed: August 17, 2004, 14:22:30 
Job time : 41.3344 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         August 17, 2004, 14:05:35 ; Search time 33.6911 Seconds
                (without alignments)
                2790.781 Million cell updates/sec

Title:          US-09-864-675-4
Perfect score:  1574
Sequence:       1 MRRDPAPGFSMLLFGVSLAC KCPVGYTGDRCQQFAMVNFS 298

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters: 1017041

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
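
The "Million cell updates/sec" figure in the run header is consistent with the usual Smith-Waterman throughput measure: query length times the number of database residues, divided by the search time. This reading is an inference from the numbers printed in the header, not a documented GenCore formula, and the function name is hypothetical. A minimal sketch:

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions."""
    if seconds <= 0:
        raise ValueError("search time must be positive")
    return query_len * db_residues / seconds / 1e6


# Values from the run header: 298-residue query, 315518202 database
# residues, 33.6911 seconds of search time.
rate = million_cell_updates_per_sec(298, 315518202, 33.6911)
print(f"{rate:.1f}")
```

The result agrees with the reported 2790.781 to within the rounding of the printed search time.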



Database :      SPTREMBL 25:*
                1:  sp_archea:*
                2:  sp_bacteria:*
                3:  sp_fungi:*
                4:  sp_human:*
                5:  sp_invertebrate:*
                6:  sp_mammal:*
                7:  sp_mhc:*
                8:  sp_organelle:*
                9:  sp_phage:*
                10: sp_plant:*
                11: sp_rodent:*
                12: sp_virus:*
                13: sp_vertebrate:*
                14: sp_unclassified:*
                15: sp_rvirus:*
                16: sp_bacteriap:*
                17: sp_archeap:*
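
The DB column of the summary table and the "DB" field of each hit refer to the numbered subsets listed above. A small lookup table makes that cross-reference explicit. The dictionary and function names are hypothetical, and the subset names are transcribed from this listing.

```python
# Numbered SPTREMBL subsets as listed in the report's Database section.
SPTREMBL_SUBSETS = {
    1: "sp_archea", 2: "sp_bacteria", 3: "sp_fungi", 4: "sp_human",
    5: "sp_invertebrate", 6: "sp_mammal", 7: "sp_mhc", 8: "sp_organelle",
    9: "sp_phage", 10: "sp_plant", 11: "sp_rodent", 12: "sp_virus",
    13: "sp_vertebrate", 14: "sp_unclassified", 15: "sp_rvirus",
    16: "sp_bacteriap", 17: "sp_archeap",
}


def subset_name(db_index):
    """Translate a DB index from a hit line into its subset name."""
    return SPTREMBL_SUBSETS.get(db_index, "unknown")


print(subset_name(11))  # subset of the rat and mouse hits in this report
```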



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 



Result           Query
   No.   Score   Match  Length  DB  ID       Description

     1     506    32.1     782  11  Q9ESA5   Q9esa5 rattus norv
     2     476    30.2     342  11  Q9ESA1   Q9esa1 rattus norv
     3     473    30.1     323  11  Q9ESA2   Q9esa2 rattus norv
     4     471    29.9     317  11  Q9ESA3   Q9esa3 rattus norv
     5   404.5    25.7     348   4  Q8NFN3   Q8nfn3 homo sapien
     6     292    18.6     241   6  Q07112   Q07112 bos taurus
     7     277    17.6     461  11  O35947   O35947 mesocricetu
     8     269    17.1      54  11  Q810X1   Q810x1 mus musculu
     9     234    14.9     211  11  Q8BKI8   Q8bkx8 mus musculu
    10     221    14.0     244  11  Q9ESA4   Q9esa4 rattus norv
    11     200    12.7      79  11  Q810X2   Q810x2 mus musculu
    12   180.5    11.5     167   4  Q8NFN2   Q8nfn2 homo sapien
    13     166    10.5    8625   5  Q86GD6   Q86gd6 procambarus
    14   164.5    10.5    5175   5  Q8I0L3   Q8i0l3 caenorhabdi
    15   164.5    10.5    5198   5  O76518   O76518 caenorhabdi
    16     158    10.0    1106   4  Q8WX93   Q8wx93 homo sapien
    17   153.5     9.8    8943   5  Q9V4F7   Q9v4f7 drosophila
    18     152     9.7     296   4  Q96IB3   Q96ib3 homo sapien
    19   151.5     9.6     111  11  Q9ESA8   Q9esa8 rattus norv
    20   151.5     9.6    6658   5  O76281   O76281 drosophila
    21     151     9.6     298  11  Q9JI59   Q9ji59 mus musculu
    22     151     9.6     298  11  Q8C5K9   Q8c5k9 mus musculu
    23     151     9.6     410   4  Q8N1M2   Q8n1m2 homo sapien
    24     151     9.6    1323  13  Q08476   Q08476 gallus gall
    25     150     9.5      76  11  Q810X0   Q810x0 mus musculu
    26     150     9.5     256  11  Q9ESA6   Q9esa6 rattus norv
    27     150     9.5     296  11  Q8BX76   Q8bx76 mus musculu
    28     150     9.5     298  11  Q8CE95   Q8ce95 mus musculu
    29     150     9.5     700  11  Q9ESB1   Q9esb1 rattus norv
    30     150     9.5    1200  11  Q8VD07   Q8vd07 mus musculu
    31     149     9.5     136  11  Q9ESA7   Q9esa7 rattus norv
    32   146.5     9.3     330  13  Q90Z42   Q90z42 gallus gall
    33   144.5     9.2     754  11  Q8BZ76   Q8bz76 mus musculu
    34     144     9.1     338   4  Q8IV49   Q8iv49 homo sapien
    35   143.5     9.1     507   4  Q96K90   Q96k90 homo sapien
    36   143.5     9.1    1320   4  Q96KF5   Q96kf5 homo sapien
    37   143.5     9.1    1320   4  Q86TC9   Q86tc9 homo sapien
    38   143.5     9.1    1391   4  Q8N3L4   Q8n3l4 homo sapien
    39   143.5     9.1    1612  11  O89026   O89026 mus musculu
    40   143.5     9.1    1651   4  Q9Y6N7   Q9y6n7 homo sapien
    41   140.5     8.9    1651  11  O55005   O55005 rattus norv
    42   140.5     8.9    2013  11  Q8VHZ8   Q8vhz8 rattus norv
    43   140.5     8.9    2013  11  Q9ERC8   Q9erc8 mus musculu
    44   140.5     8.9    3950   6  Q7YRF5   Q7yrf5 canis famil
    45     140     8.9     341  11  Q8BLK3   Q8blk3 mus musculu



ALIGNMENTS



RESULT 1 
Q9ESA5 

ID Q9ESA5 PRELIMINARY; PRT; 782 AA.

AC Q9ESA5;

DT 01-MAR-2001 (TrEMBLrel. 16, Created)

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Glial growth factor beta 1a (Fragment).

GN NRG1.

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Spinal cord, and Brain stem;

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr.,

RA Frohnert P.W.;

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms 

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons."; 

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF194993; AAG28433.1; 

DR HSSP; Q12784; 1HRE.

DR GO; GO:0005102; F:receptor binding; IEA.

DR GO; GO:0005351; F:sugar porter activity; IEA.

DR GO; GO:0009790; P:embryonic development; IEA.

DR GO; GO:0009401; P:phosphoenolpyruvate-dependent sugar phospho...; IEA.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR002114; HPr_SerP__S.

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR007110; Ig-like.

DR InterPro; IPR003598; Ig_c2.

DR InterPro; IPR002154; Neuregulin.

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 

DR PRINTS; PR01089; NEUREGULIN. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS00589; PTS_HPR_SER; 1. 

KW EGF-like domain; Immunoglobulin domain. 

FT NON_TER 1 1 

SQ SEQUENCE 782 AA; 86036 MW; F6174A68F4E27BDE CRC64; 

Query Match 32.1%; Score 506; DB 11; Length 782; 

Best Local Similarity 32.5%; Pred. No. 1.9e-36; 

Matches 121; Conservative 60; Mismatches 89; Indels 102; Gaps 12; 

Qy 23 PSLKSVQDQAYKAPVVVEGKV QG 45

II: III: I : I I I : I I I I II 
Db 1 PSVGSVQELARRAAVVIEGKVHPPRRQQGALDRKAAGEAGAGARDQPVQDSPPSQDPLPA 60

Qy 46 LVPAGGSSSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISV GS 92

: I II : : : I I II I | : : : | I I : : : : : : I I 

Db 61 VNWTLPTGGPEPST--DQPGDPAPYLVKVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPS 118

Qy 93 CVPLERNQRYIFFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 147 

I I : : I II I I : I I I I : : I I I : I I : I I I I M : : I I II I : I 

Db 119 CGRLKEDSRYIFFMEPDANSSGRAPPAFRASFPPLET-GRNLKKEVSRVLCKRCALPPRL 177 

Qy 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNGRKNSRL 204 

I : I I I I II I : II :: : : II I : I I I M : I : I = I : I I 



Db 178 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK--SEL 235

Qy 205 QFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : I : I 1 I : I : : I I I : : I I 

Db 236 RINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSESPIRIS 295

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS CKCPVGYT 285

I : t : I : I : I I III | : : | I M I I : : : : : I WW W 
Db 296 VSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCPNEFT 355 

Qy 286 GDRCQQFAMVNF 297

I I I II : I : I 
Db 356 GDRCQNYVMASF 367 
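
The Query Match percentage printed for each hit appears to be the hit's score divided by the query's perfect score (1574 in this run), rounded to one decimal place. This is inferred from the figures in the report rather than from GenCore documentation, and the names below are hypothetical. A minimal sketch:

```python
PERFECT_SCORE = 1574  # "Perfect score" from the run header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score of a hit as a percentage of the query's self-alignment score."""
    if perfect_score <= 0:
        raise ValueError("perfect score must be positive")
    return round(100.0 * score / perfect_score, 1)


# RESULT 1 above reports Score 506 and Query Match 32.1%.
print(query_match_percent(506))
```

The same relationship can be used to sanity-check Query Match values in the summary table that the scan left incomplete.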



RESULT 2 
Q9ESA1 

ID Q9ESA1 PRELIMINARY; PRT; 342 AA. 

AC Q9ESA1; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created)

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Glial growth factor GGF beta 4 (Fragment).

GN NRG1.

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus, 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A, 

RC STRAIN=Sprague-Dawley; 

RC TISSUE=Axotomized lumbar dorsal root ganglion/spinal cord; 

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr., 

RA Frohnert P.W. ; 

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms 

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons."; 

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF194997; AAG28451.1; -.

DR HSSP; Q12784; 1HRE.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW EGF-like domain; Immunoglobulin domain. 

FT NON_TER 1 1 

FT NON_TER 342 342 

SQ SEQUENCE 342 AA; 37836 MW; 8BE36FC836553124 CRC64;



Query Match 30.2%; Score 476; DB 11; Length 342; 

Best Local Similarity 34.3%; Pred. No. 3.1e-34; 

Matches 107; Conservative 56; Mismatches 87; Indels 62; Gaps 10; 



Qy 43 VQGLVPAGGSSSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISV GS 92 

I : I I I : : : I I I I I | : : : M I : : : : = : I I 

Db 5 VNWTLPTGGPEPST — DQPGDPAPYLVKVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPS 62 

Qy 93 CVPLERNQRYIFFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 147 

I I : : I I I I I : I I I I : : I I I : I I : M I I I I : : I I II I'l 

Db 63 CGRLKEDSRYIFFMEPDANSSGRAPPAFRASFPPLET-GRNLKKEVSRVLCKRCALPPRL 121 

Qy 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNGRKNSRL 204

I : I I I I II I : I I : : : : I I I : I I I I I : I : I : I : I I 

Db 122 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK — SEL 179 

Qy 205 QFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : I : I I I : I : : II I : : II 

Db 180 RINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSESPIRIS 239 

Qy 237 NSVSTTLSSWSG — HARKCNETAKSYCVNGGVCYYIEGINQLS CKCPVGYT 285 

I : I : I : I : I I III I : : M I I I I : : : : : I I I II : I 
Db 240 VSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCPNEFT 299

Qy 286 GDRCQQFAMVNF 297

I I I I I : I : I 
Db 300 GDRCQNYVMASF 311 



RESULT 3 
Q9ESA2 

ID Q9ESA2 PRELIMINARY; PRT; 323 AA. 

AC Q9ESA2; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created)

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Glial growth factor GGF beta 3 (Fragment).

GN NRG1.

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; 

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr.,

RA Frohnert P.W.;

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons.";

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF194996; AAG28450.1;

DR HSSP; Q12784; 1HRE.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR007110; Ig-like.

DR InterPro; IPR003598; Ig_c2.

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 



DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW EGF-like domain; Immunoglobulin domain. 

FT NON_TER 1 1 

SQ SEQUENCE 323 AA; 35358 MW; C7DF153A939A80C8 CRC64;

Query Match 30.1%; Score 473; DB 11; Length 323; 

Best Local Similarity 34.3%; Pred. No. 5.4e-34; 

Matches 107; Conservative 55; Mismatches 88; Indels 62; Gaps 10; 

Qy 43 VQGLVPAGGSSSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISV GS 92 

I : I I I : : : I I I I I | ::: | | | :::::: | I 

Db 5 VNWTLPTGGPEPST — DQPGDPAPYLVKVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPS 62 

Qy 93 CVPLERNQRYIFFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 147 

I I : : I I I I I : I I I I : : I I I : I I : I I I I I I : : I I I I I : I 

Db 63 CGRLKEDSRYIFFMEPDANSSGRAPPAFRASFPPLET-GRNLKKEVSRVLCKRCALPPRL 121 

Qy 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNGRKNSRL 204 

I : I M I II I : M : : : : I I I : I I I I I : I : I : I : .1 I 

Db 122 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK--SEL 179

Qy 205 QFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I 1 I : I I I : I : : I I I : : II 

Db 180 RINKASPADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSESPIRIS 239 

Qy 237 NSVSTTLSSWSG — HARKCNETAKSYCVNGGVCYYIEGINQLS CKCPVGYT 285 

I : I : I : I : I I III | : : | I I I I I : : : : : I I I I I : I 
Db 240 VSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCPNEFT 299 

Qy 286 GDRCQQFAMVNF 297 

I I I I I : I : I 
Db 300 GDRCQNYVMASF 311 



RESULT 4 
Q9ESA3 

ID Q9ESA3 PRELIMINARY; PRT; 317 AA. 

AC Q9ESA3; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created)

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Glial growth factor GGF beta 2 (Fragment).

GN NRG1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley;

RC TISSUE=Axotomized lumbar dorsal root ganglion/spinal cord;

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr.,

RA Frohnert P.W. ; 

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms 

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons."; 

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases. 



DR EMBL; AF194995; AAG28449.1; -. 

DR HSSP; Q12784; 1HRE.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR000886; ER_target_S.

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR007110; Ig-like.

DR InterPro; IPR003598; Ig_c2.

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS00014; ER_TARGET; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW EGF-like domain; Immunoglobulin domain. 

FT NON_TER 1 1 

FT NON_TER 317 317 

SQ SEQUENCE 317 AA; 34785 MW; 4487FA3E9CD876B9 CRC64; 

Query Match 29.9%; Score 471; DB 11; Length 317; 

Best Local Similarity 34.0%; Pred. No. 7.9e-34; 

Matches 106; Conservative 57; Mismatches 87; Indels 62; Gaps 10; 

Qy 43 VQGLVPAGGSSSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISV GS 92 

t :| tl :: : I MM I ::MM::: ::M I 

Db 5 VNWTLPTGGPEPST — DQPGDPAPYLVKVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPS 62 

Qy 93 CVPLERNQRYIFFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 147 

I M : M M M M I h M M M h M M M : M I M Ml 

Db 63 CGRLKEDSRYIFFMEPDANSSGRAPPAFRASFPPLET-GRDLKKEVSRVLCKRCALPPRL 121 

Qy 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNGRKNSRL 204

M M M II MM:: : : I I M I I M I : M M I : II 

Db 122 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK— SEL 179 

Qy 205 QFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : M I I M I : : I I M : II 

Db 180 RINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSESPIRIS 239 

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS CKCPVGYT 285

M M I : I : I I III I : : I II II M : : : : I II I I : I 
Db 240 VSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCPNEFT 299 

Qy 286 GDRCQQFAMVNF 297

MM! : I : I 
Db 300 GDRCQNYVMASF 311 



RESULT 5 
Q8NFN3 

ID Q8NFN3 PRELIMINARY; PRT; 348 AA. 

AC Q8NFN3; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created)

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Neuregulin 1 isoform GGF2 (Fragment).

GN NRG1.



OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Stefansson H., Sigurdsson E., Steinthorsdottir V., Bjornsdottir S., 

RA Sigmundsson T., Ghosh S., Brynjolfsson J., Gunnarsdottir S.,

RA Ivarsson O., Chou T.T., Hjaltason O., Birgisdottir B., Jonsson H.,

RA Gudnadottir V.G., Gudmundsdottir E., Bjornsson A., Ingvarsson B.,

RA Ingason A., Sigfusson S., Hardardottir H., Harvey R.P., Brunner D.,

RA Mutel V., Gonzalo A., Lemke G., Sainz J., Johannesson G.,

RA Andresson T., Gudbjartsson D., Manolescu A., Frigge M.L., Gurney M.E.,

RA Kong A., Gulcher J.R., Petursson H., Stefansson K.;

RT "Neuregulin 1 and susceptibility to Schizophrenia.";

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF491780; AAM71140.1; -. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00409; IG; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW Immunoglobulin domain. 

FT NON_TER 348 348 

SQ SEQUENCE 348 AA; 36997 MW; 15568C6260C5635C CRC64;

Query Match 25.7%; Score 404.5; DB 4; Length 348; 

Best Local Similarity 34.4%; Pred. No. 8.1e-28; 

Matches 100; Conservative 49; Mismatches 69; Indels 73; Gaps 11; 

Qy 15 GVSLACYS--PSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS--NSTRE 59

I I : I I I I I : I I I : I : I I I : I I I I I I : I I : II

Db 58 GASV-CYSSPPSVGSVQELAQRAAVVIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116

Qy 60 PPASGRVA LVKVLDKWPLRSGGLQ 83

111:11 I I I I I :: : I I I :

Db 117 PPAAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLK 176

Qy 84 REQVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128

: : : : : I II I : : I I I I I : I I : I I : : I I I : I I : I

Db 177

Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188

II I I :: I I II I : I I : I I I I II I : I I : : : : | I I : I II I I

Db 236

Qy 189 -RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYV 236

: : I : I : I : I I : I I : I : I I I : I : : I I I : : :

Db 296




RESULT 6 
Q07112 

ID Q07112 PRELIMINARY; PRT; 241 AA. 

AC Q07112; 



DT 01-JAN-1998 (TrEMBLrel. 05, Created)

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Glial growth factor. 

GN GGFBPP5. 

OS Bos taurus (Bovine) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Cetartiodactyla ; Ruminantia; Pecora; Bovoidea; 

OC Bovidae; Bovinae; Bos. 

OX NCBI_TaxID=9913 ; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Posterior pituitary; 

RX MEDLINE=93205115; PubMed=8096067;

RA Marchionni M.A., Goodearl A.D.G., Chen M., Bermingham-McDonogh O.,

RA Kirk C., Hendricks M., Danehy F., Misumi D., Sudhalter J.,

RA Kobayashi K., Wroblewski D., Lynch C., Baldasarre M., Hiles I.,

RA Davis J.B., Hsuan J., Totty N.F., Otsu M., McBurney R.N.,

RA Waterfield M.D., Stroobant P., Gwynne D.;

RT "Glial growth factors are alternatively spliced erbB2 ligands 

RT expressed in the nervous system."; 

RL Nature 362:312-318(1993). 

DR EMBL; L12259; AAA30540.1; -.

DR PIR; S32359; S32359. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR007110; Ig-like.

DR InterPro; IPR003598; Ig_c2.

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS01186; EGF_2; FALSE_NEG. 

DR PROSITE; PS50835; IG_LIKE; 1. 

SQ SEQUENCE 241 AA; 25955 MW; BF571297E8DA9796 CRC64; 



Query Match 18.6%; Score 292; DB 6; Length 241; 

Best Local Similarity 32.8%; Pred. No. 5.9e-18; 

Matches 65; Conservative 37; Mismatches 52; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKEL NRSRDIRIKYGNG 198 

I I : M : I I I I II I : II : : : : I I I : I II I : : : I : I : I 

Db 34 ALPPRLKEMKSQESVAGSKLVLRCETSSEYSSLKFKWFKNGSELSRKNKPQNIKIQKRPG 93 



Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236 

: I I : : I : I : I I I : I : : I I I : : II 

Db 94 K--SELRISKASLADSGEYMCKVISKLGNDSASANITIVESNEITTGMPASTETAYVSSE 151

Qy 237 NSVSTTLSSWSG— HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279 

I : I : I : I : I I II I I : : I I I I I I : : : : : I II 
Db 152 SPIRISVSTEGTNTSSSTSTSTAGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCK 211 

Qy 280 CPVGYTGDRCQQFAMVNF 297

II : I I I I I I : I : I 
Db 212 CPNEFTGDRCQNYVMASF 229 



RESULT 7 
O35947

ID O35947 PRELIMINARY; PRT; 461 AA.

AC O35947;

DT 01-JAN-1998 (TrEMBLrel. 05, Created)

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Pro-neuregulin-1, isoform alpha 2B precursor.

GN NRG1 OR NDF.

OS Mesocricetus auratus (Golden hamster) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Cricetinae; 

OC Mesocricetus. 

OX NCBI_TaxID=10036;

RN [1]

RP SEQUENCE FROM N.A. (ISOFORM ALPHA2B), AND SEQUENCE OF 64-81.

RC TISSUE=EMBRYO;

RX MEDLINE=98196996; PubMed=9537646;

RA Velasco J.A., Feijoo E., Avila M.A., Notario V.;

RT "Secretion of neu differentiation factor-like polypeptides by cph- 

RT transformed fibroblasts: cloning and characterization of Syrian 

RT hamster neuregulin cDNAs.";

RL Mol. Carcinog. 21:156-163(1998). 

CC -!- FUNCTION: DIRECT LIGAND FOR ERBB3 AND ERBB4 TYROSINE KINASE 

CC RECEPTORS. CONCOMITANTLY RECRUITS ERBBl 7\ND ERBB2 CORECEPTORS, 

CC RESULTING IN LIGAND- STIMULATED TYROSINE PHOSPHORYLATION AND 

CC ACTIVATION OF THE ERBB RECEPTORS. MAY PLAY AN IMPORTANT ROLE IN 

CC PROVIDING GROWTH ADVANTAGE IN NEOPLASTIC CELLS. 

CC -!- SUBUNIT: THE CYTOPLASMIC DOMAIN INTERACTS WITH THE LIM DOMAIN 

CC REGION OF LIMKl (BY SIMILARITY) . 

CC -!- SUBCELLULAR LOCATION: EXISTS AS TYPE I MEMBRANE PROTEIN AND AS A 

CC PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE MEMBRANE- 

CC BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY) . 

CC -!- TISSUE SPECIFICITY: EXPRESSED AT HIGHER LEVEL AFTER NEOPLASMIC 

CC TRANSFORMATION OF CELLS. 

CC -!- DOMAIN: THE CYTOPLASMIC DOMAIN MAY BE INVOLVED IN THE REGULATION 

CC OF TRAFFICKING AND PROTEOLYTIC PROCESSING. REGULATION OF THE 

CC PROTEOLYTIC PROCESSING INVOLVES INITIAL INTRACELLULAR DOMAIN 

CC DIMERIZATION (BY SIMILARITY) . 

CC -!- DOMAIN: ERBB RECEPTOR BINDING IS ELICITED ENTIRELY BY THE EGF-LIKE 

CC DOMAIN. 

CC -!- PTM: PROTEOLYTIC CLEAVAGE CLOSE TO THE PLASMA MEMBRANE ON THE 

CC EXTERNAL FACE LEADS TO THE RELEASE OF THE SOLUBLE GROWTH FACTOR 

CC FORM (BY SIMILARITY) . 

CC -!- PTM: EXTENSIVE GLYCOSYLATION PRECEDES PROTEOLYTIC CLEAVAGE (BY 

CC SIMILARITY) . 

CC -!- SIMILARITY: CONTAINS 1 EGF-LIKE DOMAIN. 

CC SIMILARITY: CONTAINS 1 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAIN. 

CC -!- SIMILARITY: BELONGS TO THE NEUREGULIN FAMILY. 

DR EMBL; U96612; AAB71812.1; 

DR HSSP; Q12784; IHRE. 

DR GO; GO: 0016021; C:integral to membrane; lEA. 

DR GO; GO: 0008083; F:growth factor activity; lEA. 

DR GO; GO: 0009790; P:embryonic development; lEA. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; lEGF. 



DR InterPro; IPR007110; Ig-like. 
DR InterPro; IPR003598; Ig_c2. 
DR InterPro; IPR002154; Neuregulin. 
DR Pfam; PF00008; EGF; 1. 
DR Pfam; PF00047; ig; 1. 
DR Pfam; PF02158; Neuregulin; 1. 
DR PRINTS; PR01089; NEUREGULIN. 
DR SMART; SM00181; EGF; 1. 
DR SMART; SM00408; IGc2; 1. 
DR PROSITE; PS00022; EGF_1; 1. 
DR PROSITE; PS01186; EGF_2; 1. 
DR PROSITE; PS50835; IG_LIKE; 1. 
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein; 
KW Transmembrane; Alternative splicing. 
FT PROPEP 1 13 BY SIMILARITY. 
FT CHAIN 14 461 PRO-NEUREGULIN-1, MEMBRANE-BOUND FORM. 
FT CHAIN 14 241 NEUREGULIN-1. 
FT DOMAIN 14 242 EXTRACELLULAR (POTENTIAL). 
FT TRANSMEM 243 265 INTERNAL SIGNAL SEQUENCE (POTENTIAL). 
FT DOMAIN 266 461 CYTOPLASMIC (POTENTIAL). 
FT DOMAIN 50 119 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 165 177 SER/THR-RICH. 
FT DOMAIN 178 222 EGF-LIKE. 
FT DISULFID 57 112 BY SIMILARITY. 
FT DISULFID 182 196 BY SIMILARITY. 
FT DISULFID 190 210 BY SIMILARITY. 
FT DISULFID 212 221 BY SIMILARITY. 
FT CARBOHYD 73 73 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 120 120 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 126 126 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 164 164 N-LINKED (GLCNAC...) (POTENTIAL). 
SQ SEQUENCE 461 AA; 50890 MW; 935C9560F7148336 CRC64; 



Query Match 17.6%; Score 277; DB 11; Length 461; 

Best Local Similarity 33.2%; Pred. No. 3.1e-16; 

Matches 63; Conservative 31; Mismatches 56; Indels 40; Gaps 5; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELN-RSRDIRIKYGNGRK 200 

I I : I I : I I I II I : I I : : I : : : I I I : I I I I I : : II 

Db 34 ALPPRLKEMKIQESAAGSKLVLRCETSSEYPELRFKWFKNGSELNKRTKPQNIKLQKKPG 93 

Qy 201 NSRLQFNKVKVEDAGEYVCEAENILGKDTVRG RLYV 236 

I I : I I : I : I M : I : : I I I : III 
Db 94 KSELRINKASLADSGEYMCKVISKLGNDSASANITIVDSNEFITGMPASTERAYVSSESP 153 

Qy 237 NSVSTTLSSWSG — HARKCNETAKSYCVNGGVCYYIEGINQLS CKCP 281 

I : I : I : I : I I Ml I : : I I II i I : : : : : I III 
Db 154 IRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCQ 213 

Qy 282 VGYTGDRCQQ 2 91 

I : I I II : 
Db 214 PGFTGARCTE 223 



RESULT 8 
Q810X1 

ID Q810X1 PRELIMINARY; PRT; 54 AA. 



AC Q810X1; 




DT 






DT 


n 1 _,TTTW— ? D D f TrF.MRT.re^l 94 T;:5<=-h ctf^mifanr'pa nT^H;:^ 1-*:^ ^ 






u^^i zuuo ^ 1 i ijFii5ijire_L , zio^ ija.sc annoca.cion u.pua.L-ej 






1 1 y ^ m iT "in 9 — V^^"t~.3 /TT''Ka rrm ^ 4- \ 




OS Mus musculus (Mouse). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
OX NCBI_TaxID=10090; 






RN [1] 
RP SEQUENCE FROM N.A. 
RC STRAIN=CD-1; TISSUE=Olfactory bulb; 
RA Mautino B., Dalla Costa L., Dati C.; 
RT "Bioactive recombinant NRG1, NRG2 and NRG3 expressed in E. coli."; 
RL Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases. 
DR EMBL; AY227026; AAO72523.1; -. 
DR InterPro; IPR006209; EGF_like. 
DR Pfam; PF00008; EGF; 1. 
DR PROSITE; PS00022; EGF_1; 1. 
DR PROSITE; PS01186; EGF_2; 1. 
FT NON_TER 1 1 
FT NON_TER 54 54 
SQ SEQUENCE 54 AA; 6019 MW; C25AA17A4D0BA59A CRC64; 





Query Match 17.1%; Score 269; DB 11; Length 54; 

Best Local Similarity 100.0%; Pred. No. 9.2e-17; 

Matches 47; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 KCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 47 



RESULT 9 




Q8BKI8 
ID Q8BKI8 PRELIMINARY; PRT; 211 AA. 
AC Q8BKI8; 
DT 01-MAR-2003 (TrEMBLrel. 23, Created) 
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 
DE NEUREGULIN-1. 
GN NRG1. 
OS Mus musculus (Mouse). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
OX NCBI_TaxID=10090; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC STRAIN=C57BL/6J; TISSUE=Eye; 
RX MEDLINE=22354683; PubMed=12466851; 
RA The FANTOM Consortium, 
RA the RIKEN Genome Exploration Research Group Phase I & II Team; 
RT "Analysis of the mouse transcriptome based on functional annotation of 
RT 60,770 full-length cDNAs."; 
RL Nature 420:563-573(2002). 
DR EMBL; AK051824; BAC34784.1; -. 
DR MGD; MGI:96083; Nrg1. 
DR GO; GO: ; C:cytoplasm; IDA. 
DR GO; GO:0005887; C:integral to plasma membrane; IDA. 
DR GO; GO: ; C:synaptic junction; IDA. 
DR GO; GO:0005176; F:Neu/ErbB-2 receptor binding; IDA. 
DR GO; GO:0016477; P:cell migration; IGI. 
DR GO; GO:0000902; P:cellular morphogenesis; IDA. 
DR GO; GO:0010001; P:glial cell differentiation; IMP. 
DR GO; GO:0007507; P:heart development; IDA. 
DR GO; GO:0007626; P:locomotory behavior; IMP. 
DR GO; GO:0000165; P:MAPKKK cascade; IDA. 
DR GO; GO:0007517; P:muscle development; IMP. 
DR GO; GO:0042055; P:neuronal lineage restriction; IMP. 
DR GO; GO:0045213; P:neurotransmitter receptor metabolism; IMP. 
DR GO; GO:0007422; P:peripheral nervous system development; IMP. 
DR GO; GO:0045860; P:positive regulation of protein kinase activity; 
DR GO; GO:0046579; P:positive regulation of RAS protein signal t. . 
DR GO; GO:0045595; P:regulation of cell differentiation; IMP. 
DR GO; GO:0007416; P:synaptogenesis; IMP. 
DR InterPro; IPR003599; Ig. 
DR InterPro; IPR007110; Ig-like. 
DR InterPro; IPR003598; Ig_c2. 
DR Pfam; PF00047; ig; 1. 

DR SMART; SM00409; IG; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

SQ SEQUENCE 211 AA; 22893 MW; 75D3674B988BE0D3 CRC64; 

Query Match 14.9%; Score 234; DB 11; Length 211; 

Best Local Similarity 31.7%; Pred. No. 7.7e-13; 

Matches 57; Conservative 32; Mismatches 47; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNR SRDIRIKYGNG 198 

I I : I I : I I I I II I : II : : : : I I I : I I I I I : : : : I : I 

Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRRNKPQNVKIQKKPG 93 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRG RLYV 236 

: I I : I I : I : I I I : I : : I I I : III 
Db 94 K — SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNDLTTGMSASTERPYVSSE 151 

Qy 237 NSVSTTLSSWSG— HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279 

I : I : I : I : I I III I : : I I I I I I : : : : : I II 
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFMVKDLSNPSRYLCK 211 



RESULT 10 
Q9ESA4 

ID Q9ESA4 PRELIMINARY; PRT; 244 AA. 

AC Q9ESA4; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last annotation update) 

DE Glial growth factor (Fragment). 

GN NRG1. 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 



RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; 

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr., 

RA Frohnert P.W. ; 

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms 

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons."; 

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF194994; AAG28434.1; -. 

FT NON_TER 244 244 

SQ SEQUENCE 244 AA; 25866 MW; 019CBC2DFFF8F625 CRC64; 



Query Match 14.0%; Score 221; DB 11; Length 244; 

Best Local Similarity 29.5%; Pred. No. 1.4e-11; 

Matches 62; Conservative 28; Mismatches 44; Indels 76; Gaps 8; 

Qy 5 PAPGFSMLLFGVSL ACYS — PSLKSVQDQAYKAPVWEGKV 43 

II : I I I : III II : I I I : I : I I I : I I I I 

Db 38 PPPLLLLLLLGTAALAPGAAAERAAPAGASVCYSSPPSVGSVQELARRAAWIEGKVHPP 97 



Qy 4 4 QG ' LVPAGGSSSNSTREPPASGRV 66 

I I :| I I :: : I 

Db 98 RRQQGALDRKAAGEAGAGARDQPVQDSPPSQDPLPAVNWTLPTGGPEPST — DQPGDPAP 155 

Qy 67 ALVKVLDKWPLRSGGLQREQVISV GSCVPLERNQRYIFFLEPT EQ 111 

I I I I I ::: I I I :::::: I II I : : I I || I : I I 

Db 156 YLVKVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPSCGRLKEDSRYIFFMEPDANSSGRA 215 



Qy 112 PLVFKTAFAPLDTNGKNLKKEVGKILCTDC 141 

I I : : I II : I I : I I I II I : : I I I 
Db 216 PPAFRASFPPLET-GRNLKKEVSRVLCKRC 244 



RESULT 11 
Q810X2 

ID Q810X2 PRELIMINARY; PRT; 79 AA. 

AC Q810X2; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Neuregulin 2-alpha (Fragment). 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=CD-1; TISSUE=Olfactory bulb; 

RA Mautino B., Dalla Costa L., Dati C.; 

RT "Bioactive recombinant NRG1, NRG2 and NRG3 expressed in E. coli."; 

RL Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AY227025; AAO72522.1; -. 

DR InterPro; IPR006209; EGF_like. 

DR Pfam; PF00008; EGF; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS01186; EGF_2; 1. 

FT NON_TER 1 1 



SQ SEQUENCE 79 AA; 8727 MW; DA4501900C610780 CRC64; 



Query Match 12.7%; Score 200; DB 11; Length 79; 

Best Local Similarity 89.5%; Pred. No. 2.3e-10; 

Matches 34; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy 252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289 

I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I : I II 
Db 1 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 38 



RESULT 12 
Q8NFN2 

ID Q8NFN2 PRELIMINARY; PRT; 167 AA. 

AC Q8NFN2; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Neuregulin 1 isoform GGF (Fragment). 

GN NRG1. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Stefansson H., Sigurdsson E., Steinthorsdottir V., Bjornsdottir S., 

RA Sigmundsson T., Ghosh S., Brynjolfsson J., Gunnarsdottir S., 

RA Ivarsson O., Chou T.T., Hjaltason O., Birgisdottir B., Jonsson H., 

RA Gudnadottir V.G., Gudmundsdottir E., Bjornsson A., Ingvarsson B., 

RA Ingason A., Sigfusson S., Hardardottir H., Harvey R.P., Brunner D., 

RA Mutel V., Gonzalo A., Lemke G., Sainz J., Johannesson G., 

RA Andresson T., Gudbjartsson D., Manolescu A., Frigge M.L., Gurney M.E., 

RA Kong A., Gulcher J.R., Petursson H., Stefansson K.; 

RT "Neuregulin 1 and susceptibility to Schizophrenia."; 

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF491780; AAM71139.1; -. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00409; IG; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW Immunoglobulin domain. 

FT NON_TER 167 167 

SQ SEQUENCE 167 AA; 17983 MW; 9C2FB3A579325FF4 CRC64; 

Query Match 11.5%; Score 180.5; DB 4; Length 167; 

Best Local Similarity 30.9%; Pred. No. 3.5e-08; 

Matches 51; Conservative 30; Mismatches 63; Indels 21; Gaps 5; 

Qy 12 6 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 17 8 

M I I I I : I I : II : I I I I II I : II : : : : I 

Db 11 GKGKKKERGSGKKPES7\AGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 7 0 



Qy 17 9 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLY 235 



M - I I I I I . . I . I . I . I I . II - I • I I I • I 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK — SELRINKASLADSGEYMCKVISKLGNDSASANIT 12 8 

Qy 236 V NSVSTTLSSWSGJiARKCNETT^SYCVNGGVCYYIEGINQLS 277 

: I : I : : : I : I : I : I I I I 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVS TEGANTSS 167 



RESULT 13 
Q86GD6 



ID Q86GD6 PRELIMINARY; PRT; 8625 AA. 
AC Q86GD6; 
DT 01-JUN-2003 (TrEMBLrel. 24, Created) 
DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 
DE Projectin. 
GN PROJ. 
OS Procambarus clarkii (Red swamp crayfish). 
OC Eukaryota; Metazoa; Arthropoda; Crustacea; Malacostraca; 
OC Eumalacostraca; Eucarida; Decapoda; Pleocyemata; Astacidea; 
OC Astacoidea; Cambaridae; Procambarus. 
OX NCBI_TaxID=6728; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC TISSUE=Muscle; 
RA Oshino T., Shimamura J., Fukuzawa A., Maruyama K., Kimura S.; 
RT "The entire cDNA sequences of projectin isoforms of crayfish claw 
RT closer and flexor muscles and their localization 
RL J. Muscle Res. Cell. Motil. 0:0-0(2003). 
DR EMBL; AB055927; BAC66140.1; -. 
DR GO; GO:0005743; C:mitochondrial inner membrane; IEA. 
DR GO; GO:0005524; F:ATP binding; IEA. 
DR GO; GO:0005488; F:binding; IEA. 
DR GO; GO:0004674; F:protein serine/threonine kinase activity; IEA. 
DR GO; GO:0004713; F:protein-tyrosine kinase activity; IEA. 
DR GO; GO:0003743; F:translation initiation factor activity; IEA. 
DR GO; GO:0006468; P:protein amino acid phosphorylation; IEA. 
DR GO; GO:0006413; P:translational initiation; IEA. 
DR GO; GO:0006810; P:transport; IEA. 
DR InterPro; IPR003962; FnIII_subd. 
DR InterPro; IPR003961; FN_III. 
DR InterPro; IPR008957; FN_III-like. 
DR InterPro; IPR003599; Ig. 
DR InterPro; IPR007110; Ig-like. 
DR InterPro; IPR003598; Ig_c2. 
DR InterPro; IPR003596; Ig_v. 
DR InterPro; IPR001993; Mitoch_carrier. 
DR InterPro; IPR000719; Prot_kinase. 
DR InterPro; IPR002290; Ser_thr_pkinase. 
DR InterPro; IPR008271; Ser_thr_pkin_AS. 
DR InterPro; IPR001950; TIF_SUI1. 
DR InterPro; IPR001245; Tyr_pkinase. 
DR Pfam; PF00041; fn3; 39. 
DR Pfam; PF00047; ig; 13. 
DR Pfam; PF00069; pkinase; 1. 
DR PRINTS; PR00014; FNTYPEIII. 
DR ProDom; PD000001; Prot_kinase; 1. 



DR SMART; SM00060; FN3; 39. 

DR SMART; SM00409; IG; 36. 

DR SMART; SM00408; IGc2; 24. 

DR SMART; SM00406; IGv; 3. 

DR SMART; SM00220; S_TKc; 1. 

DR SMART; SM00219; TyrKc; 1. 

DR PROSITE; PS50835; IG_LIKE; 24. 

DR PROSITE; PS00215; MITOCH_CARRIER; 2. 

DR PROSITE; PS00107; PROTEIN_KINASE_ATP; 1. 

DR PROSITE; PS50011; PROTEIN_KINASE_DOM; 1. 

DR PROSITE; PS00108; PROTEIN_KINASE_ST; 1. 

DR PROSITE; PS01118; SUI1_1; 1. 

SQ SEQUENCE 8625 AA; 962637 MW; 56B8E4C4FE0AFC90 CRC64; 



Query Match 10.5%; Score 166; DB 5; Length 8625; 

Best Local Similarity 24.2%; Pred. No. 0.00014; 

Matches 87; Conservative 39; Mismatches 130; Indels 104; Gaps 16; 



Qy 2 RRDP APGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPA 49 

III 111:11 : I : II : I I M 

Db 8115 RRQPLTYKLWQEEGEGAPSFTFLLRPRVIQCH QTCKLLCCLAGKP VPT 8162 

Qy 50 GGSSSNST REPPASGRVT^VKVLDKWPLRSGGLQREQVISVG SCVPLER 98 

II I : I II : : : : : I I I : I : I Ml: 

Db 8163 VKWYKGSQELSKFDYSQSHADG-WTIEIVNCKPADSGKYRCVATNSLGTDETSCWIVE 8221 

Qy 99 NQRYIFF LEPTEQPLV FKTAFAPLDTNGK NLKKEVGKILCTDCATRP 145 

::||| III: : I :|: : I I II 

Db 8222 DRRYIETTIKDLPPPPTPAIRVDDTSSSSYFTSTHKDGRSSTSTKVEAASSSSTSSAAAS 8281 



Qy 14 6 KLK KMKSQTGQV GEKQS 162 

11:11 II 
Db 8282 GAKRTLKPYGKRQDSTGSTSRSRSATKELELPPDDSLMGPPGFSGELPKTLAIKDGEALC 8341 

Qy 163 LKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAE 222 

III I : I : I II I I I : I : I I : I I I : I I : I I I I I I I : I 

Db 8342 LKC-TVKGDPEPQVSWFKDGEPLSSSDIIDLKYRQGL— ASLTINEVFPEDEGLYVCKAT 8398 

Qy 223 NILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAK SYCVNGGVCYYIEGINQLSCK 279 

: I I : : I : : : : : M I : III: I I I I 

Db 8399 SSLGSAETKCKLSISPMEQQINGKSGRGDKLPRITQHLLSQEVPDGTAH TLSCK 8452 



RESULT 14 
Q8I0L3 

ID Q8I0L3 PRELIMINARY; PRT; 5175 AA. 

AC Q8I0L3; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE C. elegans him-4 protein (corresponding sequence F15G9.4a). 

GN F15G9.4 OR HIM-4. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Sulston J.E.; 

RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99069613; PubMed=9851916; 

RA none; 

RT "Genome sequence of the nematode C. elegans: A platform for 

RT investigating biology."; 

RL Science 282:2012-2018(1998). 

RN [3] 

RP SEQUENCE FROM N.A. 

RA Kershaw J.K.; 

RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; Z47068; CAA87335.1; -. 

DR EMBL; Z47070; CAA87335.1; JOINED. 

DR EMBL; Z47070; CAA87344.1; -. 

DR EMBL; Z47068; CAA87344.1; JOINED. 

DR PIR; T20992; T20992. 

DR WormPep; F15G9.4a; CE18595. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005509; F:calcium ion binding; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR000152; Asx_hydroxyl_S. 

DR InterPro; IPR000515; BPD_transp. 

DR InterPro; IPR001881; EGF_Ca. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR003596; Ig_v. 

DR Pfam; PF00047; ig; 47. 

DR SMART; SM00181; EGF; 2. 

DR SMART; SM00179; EGF_CA; 2. 

DR SMART; SM00409; IG; 45. 

DR SMART; SM00408; IGc2; 47. 

DR SMART; SM00406; IGv; 12. 

DR PROSITE; PS00010; ASX_HYDROXYL; 1. 

DR PROSITE; PS00402; BPD_TRANSP_INN_MEMBR; 1. 

DR PROSITE; PS01186; EGF_2; 1. 

DR PROSITE; PS01187; EGF_CA; 2. 

DR PROSITE; PS50835; IG_LIKE; 47. 

SQ SEQUENCE 5175 AA; 568471 MW; 4B2561803BBC62A4 CRC64; 

Query Match 10.5%; Score 164.5; DB 5; Length 5175; 

Best Local Similarity 26.5%; Pred. No. 9.3e-05; 

Matches 62; Conservative 26; Mismatches 79; Indels 67; Gaps 10; 



Qy 4 9 AGGSSSNSTR EPPASGRV7VLVKVLDK WPLRS 79 

INI: I III: Ml: I : 

Db 595 AGGMSTRKMRLDIMEPPS VKVTPQDVYFNMREGVNLSCEAMGDPKPEVHWYFKG 648 



Qy 80 GGLQREQVISVGSCVPLERNQRYIFFL EPTEQPLVFKTAFAPLDTNGKNLKKEV 133 

I : II :::::: | j : I II I 
Db 649 RHLLNDYKYQVG QDSKFLYIRDATHHDEGTYECRAMSQAGQARDTTDLML 698 



Qy 134 GKILCTDCATRPKLK — KMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDI 191 

MM:: : I | : h : I : M III I I II : II : I : I 
Db 699 ATPPKVEIIQNKMMVGR-GDRVSFECKTIRGKPHPKIRWFKNGKDLIKPDDY 749 

Qy 192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245 

M I : I I II I I I I II : II I I I I I : I 

Db 750 -IKINEG QLHIMGAKDEDAGAYSCVGENMAGKDVQVANLSVGRVPTIIES 798 



RESULT 15 
O76518 

ID O76518 PRELIMINARY; PRT; 5198 AA. 

AC O76518; Q10036; 

DT 01-NOV-1998 (TrEMBLrel. 08, Created) 

DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Hemicentin precursor (C. elegans him-4 protein) (corresponding 

DE sequence F15G9.4b). 

GN F15G9.4 OR HIM-4. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Bristol N2; 

RX PubMed=11222143; 

RA Vogel B.E., Hedgecock E.M.; 

RT "Hemicentin, a conserved extracellular member of the immunoglobulin 

RT superfamily, organizes epithelial and other cell attachments into 

RT oriented line-shaped junctions."; 

RL Development 128:883-894(2001). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Sulston J.E.; 

RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99069613; PubMed=9851916; 

RA none; 

RT "Genome sequence of the nematode C. elegans: A platform for 

RT investigating biology."; 

RL Science 282:2012-2018(1998). 

RN [4] 

RP SEQUENCE FROM N.A. 

RA Kershaw J.K.; 

RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF074901; AAC26792.1; -. 

DR EMBL; Z47068; CAA87336.1; -. 

DR EMBL; Z47070; CAA87336.1; JOINED. 

DR EMBL; Z47070; CAA87345.1; -. 

DR EMBL; Z47068; CAA87345.1; JOINED. 

DR PIR; T43290; T43290. 

DR HSSP; P00736; 1APQ. 

DR WormPep; F15G9.4b; CE18596. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005509; F:calcium ion binding; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR000152; Asx_hydroxyl_S. 

DR InterPro; IPR000515; BPD_transp. 

DR InterPro; IPR001881; EGF_Ca. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR002035; VWF_A. 

DR Pfam; PF00047; ig; 47. 

DR SMART; SM00179; EGF_CA; 1. 

DR SMART; SM00408; IGc2; 44. 

DR SMART; SM00327; VWA; 1. 

DR PROSITE; PS00010; ASX_HYDROXYL; 1. 

DR PROSITE; PS00402; BPD_TRANSP_INN_MEMBR; 1. 

DR PROSITE; PS01186; EGF_2; 1. 

DR PROSITE; PS01187; EGF_CA; 2. 

DR PROSITE; PS50835; IG_LIKE; 47. 

KW EGF-like domain; Immunoglobulin domain; Signal. 

FT SIGNAL 1 24 POTENTIAL. 

FT CHAIN 25 5198 HEMICENTIN. 

SQ SEQUENCE 5198 AA; 570809 MW; DA8511FF2B58D37B CRC64; 



Query Match 10.5%; Score 164.5; DB 5; Length 5198; 

Best Local Similarity 26.5%; Pred. No. 9.4e-05; 

Matches 62; Conservative 26; Mismatches 79; Indels 67; Gaps 10; 

Qy 4 9 AGGSSSNSTR EPPASGRVALVKVXDK WPLRS 7 9 

I I I I : I III: III: I : 

Db 595 AGGMSTRKMRLDIMEPPS VKVTPQDVYFNMREGVNLSCEAMGDPKPEVHWYFKG 64 8 



Qy 80 GGLQREQVISVGSCVPLERNQRYIFFL EPTEQPLVFKTAFAPLDTNGKNLKKEV 133 

I : II :::::: | | : I II I 
Db 64 9 RHLLNDYKYQVG QDSKFLYIRDATHHDEGTYECRAMSQAGQAJ^DTTDLML 698 

Qy 134 GKILCTDCATRPKLK— KMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDI 191 

MM:: : | | : j : : | : | : | | I II II : I I : I : I 
Db 699 ATPPKVEIIQNKMMVGR-GDRVSFECKTIRGKPHPKIRWFKNGKDLIKPDDY 749 

Qy 192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245 

II I : I I I II I I I M : I M I I I I : I 

Db 750 -IKINEG QLHIMGAKDEDAGAYSCVGENMAGKDVQVANLSVGRVPTIIES 798 



Search completed: August 17, 2004, 14:12:41 
Job time : 34.6911 secs 
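
The per-result statistics lines above (Matches, Conservative, Mismatches, Indels, Best Local Similarity) appear to be related by a simple ratio: for the results listed here, Best Local Similarity equals Matches divided by the sum of Matches, Conservative, Mismatches and Indels. That relation is inferred from the printed values and is not documented anywhere in this output. The sketch below only checks it; the function names are hypothetical and are not part of GenCore.

```python
import re

# Matches "Name 123;" pairs in a GenCore-style alignment statistics line.
COUNT_RE = re.compile(r"(Matches|Conservative|Mismatches|Indels|Gaps)\s+(\d+);")


def parse_alignment_counts(line):
    """Return the integer counts found in one statistics line as a dict."""
    return {name: int(value) for name, value in COUNT_RE.findall(line)}


def best_local_similarity(counts):
    """Percentage of alignment columns that are exact matches (assumed formula)."""
    columns = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    if columns == 0:
        raise ValueError("alignment has no columns")
    return round(100.0 * counts["Matches"] / columns, 1)


# Statistics line copied from RESULT 15 above, which reports 26.5%.
line = "Matches 62; Conservative 26; Mismatches 79; Indels 67; Gaps 10;"
counts = parse_alignment_counts(line)
print(best_local_similarity(counts))
```

The Gaps value is parsed but not used: under this assumed formula only the four column counts contribute to the denominator.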



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: August 17, 2004, 13:56:40 ; Search time 8.5414 Seconds 

(without alignments) 
1816.670 Million cell updates/sec 

Title: US-09-864-675-4 

Perfect score: 1574 

Sequence: 1 MRRDPAPGFSMLLFGVSLAC...KCPVGYTGDRCQQFAMVNFS 298 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 141681 seqs, 52070155 residues 

Total number of hits satisfying chosen parameters: 141681 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : SwissProt_42:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 



Result             Query 
   No.    Score    Match  Length  DB  ID           Description 
     1     1544     98.1     756   1  NRG2_MOUSE   P56974 mus musculu 
     2     1505     95.6     850   1  NRG2_HUMAN   O14511 homo sapien 
     3     1470     93.4     868   1  NRG2_RAT     O35569 rattus norv 
     4    305.5     19.4     602   1  NRG1_CHICK   Q05199 gallus gall 
     5    299.5     19.0     677   1  NRG1_XENLA   O93383 xenopus lae 
     6      299     19.0     662   1  NRG1_RAT     P43322 r pro-neure 
     7    280.5     17.8     639   1  NRG1_HUMAN   Q02297 h pro-neure 
     8    208.5     13.2     623   1  VEIN_DROME   Q94918 drosophila 
     9      155      9.8     298   1  JAM2_HUMAN   P57087 homo sapien 
    10      152      9.7     296   1  SMDF_HUMAN   Q15491 homo sapien 
    11      150      9.5    1217   1  EGF_MOUSE    P01132 mus musculu 
    12    145.5      9.2    2012   1  DSCA_HUMAN   O60469 homo sapien 
    13      145      9.2     338   1  LAMP_RAT     Q62813 rattus norv 
    14      144      9.1     338   1  LAMP_HUMAN   Q13449 homo sapien 
    15    141.5      9.0     353   1  CEPU_CHICK   Q90773 gallus gall 
    16    135.5      8.6    1356   1  VGR2_HUMAN   P35968 homo sapien 
    17      135      8.6    4391   1  PGBM_HUMAN   P98160 homo sapien 
    18    133.5      8.5    1367   1  VGR2_MOUSE   P35918 mus musculu 
    19    133.5      8.5    6632   1  UN89_CAEEL   O01761 caenorhabdi 
    20    132.5      8.4    1133   1  EGF_RAT      P07522 rattus norv 
    21    131.5      8.4     338   1  LAMP_CHICK   Q98919 gallus gall 
    22    131.5      8.4    1343   1  VGR2_RAT     O08775 rattus norv 
    23      131      8.3     345   1  OPCM_RAT     P32736 rattus norv 
    24    130.5      8.3     824   1  MLT1_HUMAN   Q9udy8 homo sapien 
    25    129.5      8.2     837   1  NCM2_MOUSE   O35136 mus musculu 
    26    129.5      8.2    1018   1  CONT_HUMAN   Q12860 homo sapien 
    27    129.5      8.2    1064   1  FBP1_STRPU   P10079 strongyloce 
    28      129      8.2     345   1  OPCM_HUMAN   Q14982 homo sapien 
    29    128.5      8.2    1010   1  CONT_CHICK   P14781 gallus gall 
    30    128.5      8.2    1020   1  CONT_MOUSE   P12960 mus musculu 
    31    128.5      8.2    1021   1  CONT_RAT     Q63198 rattus norv 
    32    128.5      8.2    1091   1  NCA1_CHICK   P13590 gallus gall 
    33      128      8.1    1040   1  AXO1_HUMAN   Q02246 homo sapien 
    34    127.5      8.1     761   1  NCA2_HUMAN   P13592 homo sapien 
    35    127.5      8.1     848   1  NCA1_HUMAN   P13591 homo sapien 
    36      127      8.1     298   1  JAM1_BOVIN   Q9xt56 bos taurus 
    37      127      8.1     344   1  NTRI_RAT     Q62718 rattus norv 
    38    126.5      8.0    1040   1  AXO1_RAT     P22063 rattus norv 
    39    125.5      8.0     853   1  NCA1_BOVIN   P31836 bos taurus 
    40    125.5      8.0     937   1  ROR1_HUMAN   Q01973 homo sapien 
    41      125      7.9     937   1  ROR1_MOUSE   Q9z139 mus musculu 
    42    124.5      7.9    1036   1  AXO1_CHICK   P28685 gallus gall 
    43      124      7.9     344   1  NTRI_MOUSE   Q99pj0 mus musculu 
    44      123      7.8     345   1  OPCM_BOVIN   P11834 bos taurus 
    45      122      7.8      53   1  EGF_PIG      Q00968 sus scrofa 
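
In the summary rows above, the Query Match column appears to be the hit score expressed as a percentage of the perfect score (1574 for this query), rounded to one decimal place. This reading is inferred from the listed values and is not stated in the GenCore output itself. A minimal sketch of the check, with a hypothetical function name:

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the perfect (self-match) score, one decimal.

    Assumed relation, inferred from the summary rows; not a documented
    GenCore formula.
    """
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 1574  # "Perfect score" from the search header

# (score, reported Query Match) pairs copied from the summary rows.
rows = [(1544, 98.1), (1505, 95.6), (305.5, 19.4), (122, 7.8)]
for score, reported in rows:
    print(score, query_match_percent(score, PERFECT_SCORE), reported)
```

Each printed line shows the score, the recomputed percentage and the percentage reported in the table, so any disagreement is visible at a glance.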



ALIGNMENTS 



RESULT 1 




NRG2_MOUSE 
ID NRG2_MOUSE STANDARD; PRT; 756 AA. 
AC P56974; 
DT 16-OCT-2001 (Rel. 40, Created) 
DT 16-OCT-2001 (Rel. 40, Last sequence update) 
DT 10-OCT-2003 (Rel. 42, Last annotation update) 
DE Pro-neuregulin-2 precursor (Pro-NRG2) [Contains: Neuregulin-2 (NRG-2) 
DE (Divergent of neuregulin 1) (DON-1)]. 
GN NRG2. 
OS Mus musculus (Mouse). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
OX NCBI_TaxID=10090; 
RN [1] 
RP SEQUENCE FROM N.A. (ISOFORMS NRG2-5, NRG2-10 AND NRG2-16A). 
RC STRAIN=C57BL/6; TISSUE=Brain; 
RX MEDLINE=97311398; PubMed=9168115; 
RA Carraway K.L. III, Weber J.L., Unger M.J., Ledesma J., Yu N., 
RA Gassmann M., Lai C.; 
RT "Neuregulin-2, a new ligand of ErbB3/ErbB4-receptor tyrosine 
RT kinases."; 
RL Nature 387:512-516(1997). 
RN [2] 





RP SEQUENCE OF 150-756 FROM N.A, (ISOFORMS DON- IM AND DON-IS). 

RC TISSUE=Choroid plexus; 

RX MEDLINE=97342638; PubMed=9199335; 

RA Busfield S.J., Michnick D.A., Chickering T.W., Revett T.L., Ma J., 

RA Woolf E.A. , Comrack C.A. , Dussault B.J., Woolf J., Goodearl A.D.J., 

RA Gearing D. P . ; 

RT "Characterization of a neuregulin-related gene, Don-1, that is highly 

RT expressed in restricted regions of the cerebellum and hippocampus."; 

RL Mol. Cell. Biol. 17:4007-4014(1997). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase 
CC receptors. Concomitantly recruits ERBBl and ERBB2 coreceptors, 

CC resulting in ligand-stimulated tyrosine phosphorylation and 

CC activation of the ERBB receptors. May also promote the 

CC heterodimerization with the EGF receptor. 

CC SUBCELLULAR LOCATION: EXISTS AS AN TYPE I MEMBRANE PROTEIN AND AS 

CC A PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE 

CC MEMBRANE-BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY) . 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=4; 

CC Comment =Additional isoforms seem to exist; 

CC Name=NRG2-16A; 

CC IsoId=P56974-l; Sequence=Displayed; 

CC Name=DON-lM; 

CC IsoId=P56974-2; Sequence=VSP_003464 ; 

CC Name=DON-lS; Synonyms=NRG2-5 ; 

CC IsoId=P56974-3; Sequence=VSP_003462, VSP_003463; 

CC Name=NRG2-10; 

CC IsoId=P56974-4; Sequence=VSP_003460, VSP_003461; 

CC -!- TISSUE SPECIFICITY: Highest expression in the brain, with lower 
CC levels in the lung. In the cerebellum, found in granule and 

CC Purkinje cells. 

CC -!~ DOMAIN; The cytoplasmic domain may be involved in the regulation 
CC of trafficking and proteolytic processing. Regulation of the 

CC proteolytic processing involves initial intracellular domain 

CC dimerization (By similarity).
CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain (By similarity).

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the 
CC external face leads to the release of the soluble growth factor 

CC form (By similarity).

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).

CC -!- SIMILARITY: Contains 1 EGF-like domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.

CC -!- SIMILARITY: Belongs to the neuregulin family. 

DR HSSP; Q12784; 1HRE.

DR MGD; MGI:1098246; Nrg2.

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; lEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR002154; Neuregulin. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 



DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.




KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Multigene family; Alternative splicing.


FT PROPEP 1 19 BY SIMILARITY.
FT CHAIN 20 756 PRO-NEUREGULIN-2, MEMBRANE-BOUND FORM.


FT CHAIN ? ?
FT DOMAIN 20 ? EXTRACELLULAR (POTENTIAL).
FT TRANSMEM ? ? INTERNAL SIGNAL SEQUENCE (POTENTIAL).
FT DOMAIN ? ? CYTOPLASMIC (POTENTIAL).
FT DOMAIN 145 240 IG-LIKE C2-TYPE.
FT DOMAIN ? 248
FT DOMAIN 249 290 EGF-LIKE.
FT DOMAIN ? ? POLY-PRO.
FT DISULFID 165 219 BY SIMILARITY.
FT DISULFID 253 267 BY SIMILARITY.
FT DISULFID 261 278 BY SIMILARITY.
FT DISULFID 280 289 BY SIMILARITY.
FT CARBOHYD ? ? N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD ? ? N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD ? ? N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD ? ? N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 280 280 ? (in isoform NRG2-10).
FT /FTId=VSP_003460.
FT VARSPLIC 281 756 Missing (in isoform NRG2-10).
FT /FTId=VSP_003461.
FT VARSPLIC 282 330 VGYTGDRCQQFAMVNFSKHLGFELKEVAEELYQKRVLTITGI
FT CVALLWG -> NGFFGQRCLEKLPLRLYMPDPKQSVLWDT
FT PGTGVSSSQWSTSPSTLDLN (in isoform DON-1S).
FT /FTId=VSP_003462.
FT VARSPLIC 331 756 Missing (in isoform DON-1S).
FT /FTId=VSP_003463.
FT VARSPLIC 282 307 VGYTGDRCQQFAMVNFSKHLGFELKE -> NGFFGQRCLEK
FT LPLRLYMPDPKQK (in isoform DON-1M).
FT /FTId=VSP_003464.
SQ SEQUENCE 756 AA; 82213 MW; 51D85DC918BE678E CRC64;


Query Match 98.1%; Score 1544; DB 1; Length 756;

Best Local Similarity 97.3%; Pred. No. 5e-116;

Matches 290; Conservative 5; Mismatches 3; Indels 0; Gaps 0;

Qy 1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60 

I I I I I I I I I M I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I 
Db 1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGSSSNSTREP 60 

Qy 61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120 

I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120 

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180 

I : I I I I I : I I I I I I I I I I I I I I I I I I M I I I I I I : I I I M I I I I I I I I I I I I I I I I M I 
Db 121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180 

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240 

I I I I I I M I I I I M I M I I I M I I M I I I : I I I I I I I I I I I I I I I I I I I I I I I I : I I I M 



Db 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVRVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240



Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298 

I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I 
Db 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQQFAMVNFS 298 
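
The summary statistics printed above each alignment appear to be derived from the score and the match counts. A minimal sketch, assuming that "Query Match" is the hit score divided by the query's perfect score (1574 in this run) and that "Best Local Similarity" is the number of identical positions divided by the number of alignment columns (Matches + Conservative + Mismatches + Indels). Both formulas and the function names are assumptions for illustration, not documented GenCore definitions; they do reproduce the percentages reported in this listing.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's self-alignment score (assumed definition)."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches: int, conservative: int,
                                  mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Values taken from the first alignment in this report (score 1544, perfect score 1574).
    print(round(query_match_percent(1544, 1574), 1))
    print(round(best_local_similarity_percent(290, 5, 3, 0), 1))
```

The same two functions can be applied to the later results, for example the NRG1_CHICK hit with score 305.5 and 65 matches over 197 columns.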

RESULT 2 
NRG2_HUMAN

ID NRG2_HUMAN STANDARD; PRT; 850 AA.

AC O14511;

DT 15-DEC-1998 (Rel. 37, Created) 

DT 15-DEC-1998 (Rel. 37, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Pro-neuregulin-2 precursor (Pro-NRG2) [Contains: Neuregulin-2 (NRG-2)
DE (Neural- and thymus-derived activator for ERBB kinases) (NTAK)
DE (Divergent of neuregulin 1) (DON-1)].
GN NRG2 OR NTAK.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM 1).
RC TISSUE=Neuroblastoma;
RX MEDLINE=98006324; PubMed=9348101;
RA Higashiyama S., Horikawa M., Yamada K., Ichino N., Nakano N.,
RA Nakagawa T., Miyagawa J., Matsushita N., Nagatsu T., Taniguchi N.,
RA Ishiguro H.;

RT "A novel brain-derived member of the epidermal growth factor family 

RT that interacts with ErbB3 and ErbB4."; 

RL J. Biochem. 122:675-680(1997). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORMS DON-1B AND DON-1R).
RC TISSUE=Fetal brain;
RX MEDLINE=97342638; PubMed=9199335;
RA Busfield S.J., Michnick D.A., Chickering T.W., Revett T.L., Ma J.,
RA Woolf E.A., Comrack C.A., Dussault B.J., Woolf J., Goodearl A.D.J.,

RA Gearing D.P.; 

RT "Characterization of a neuregulin-related gene, Don-1, that is highly 

RT expressed in restricted regions of the cerebellum and hippocampus."; 

RL Mol. Cell. Biol. 17:4007-4014(1997). 

RN [3] 

RP SEQUENCE FROM N.A. (ISOFORMS 1; 2; 3; 4; 5 AND 6). 

RC TISSUE=Fetal brain, and Lung; 

RX MEDLINE=99295836; PubMed=10369162;

RA Ring H.Z., Chang H., Guilbot A., Brice A., LeGuern E., Francke U.;

RT "The human neuregulin 2 (NRG2) gene: cloning, mapping and evaluation 

RT as a candidate for the autosomal recessive form of Charcot-Marie-Tooth 

RT disease linked to 5q."; 

RL Hum. Genet. 104:326-332(1999).

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase 
CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors,

CC resulting in ligand-stimulated tyrosine phosphorylation and 

CC activation of the ERBB receptors. May also promote the 

CC heterodimerization with the EGF receptor. 

CC -!- SUBCELLULAR LOCATION: EXISTS AS A TYPE I MEMBRANE PROTEIN AND AS
CC A PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE
CC MEMBRANE-BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY).

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=8;
CC Name=1;
CC IsoId=O14511-1; Sequence=Displayed;
CC Name=2;
CC IsoId=O14511-2; Sequence=VSP_003453;
CC Name=3;
CC IsoId=O14511-3; Sequence=VSP_003455;
CC Name=4;
CC IsoId=O14511-4; Sequence=VSP_003454;
CC Name=5;
CC IsoId=O14511-5; Sequence=VSP_003458, VSP_003459;
CC Name=6;
CC IsoId=O14511-6; Sequence=VSP_003456, VSP_003457;
CC Name=DON-1B;
CC IsoId=O14511-7; Sequence=VSP_003452, VSP_003455;
CC Name=DON-1R;
CC IsoId=O14511-8; Sequence=VSP_003451;

CC -!- TISSUE SPECIFICITY: Restricted to the cerebellum in the adult. 
CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation

CC of trafficking and proteolytic processing. Regulation of the 

CC proteolytic processing involves initial intracellular domain 

CC dimerization (By similarity).
CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain (By similarity).
CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor
CC form (By similarity).
CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).

CC -!- SIMILARITY: Contains 1 EGF-like domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.

CC -!- SIMILARITY: Belongs to the neuregulin family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB005060; BAA23417.1; -. 

DR EMBL; AF119162; AAF28848.1; -.

DR EMBL; AF119151; AAF28848.1; JOINED. 

DR EMBL; AF119152; AAF28848.1; JOINED. 

DR EMBL; AF119153; AAF28848.1; JOINED. 

DR EMBL; AF119154; AAF28848.1; JOINED. 

DR EMBL; AF119155; AAF28848.1; JOINED. 

DR EMBL; AF119158; AAF28848.1; JOINED. 

DR EMBL; AF119159; AAF28848.1; JOINED. 

DR EMBL; AF119160; AAF28848.1; JOINED. 

DR EMBL; AF119161; AAF28848.1; JOINED. 

DR EMBL; AF119162; AAF28849.1; -.
DR EMBL; AF119151; AAF28849.1; JOINED.
DR EMBL; AF119152; AAF28849.1; JOINED.
DR EMBL; AF119153; AAF28849.1; JOINED.
DR EMBL; AF119154; AAF28849.1; JOINED.
DR EMBL; AF119156; AAF28849.1; JOINED.
DR EMBL; AF119158; AAF28849.1; JOINED.
DR EMBL; AF119159; AAF28849.1; JOINED.
DR EMBL; AF119160; AAF28849.1; JOINED.
DR EMBL; AF119161; AAF28849.1; JOINED.
DR EMBL; AF119162; AAF28850.1; -.
DR EMBL; AF119151; AAF28850.1; JOINED.
DR EMBL; AF119152; AAF28850.1; JOINED.
DR EMBL; AF119153; AAF28850.1; JOINED.
DR EMBL; AF119154; AAF28850.1; JOINED.
DR EMBL; AF119155; AAF28850.1; JOINED.
DR EMBL; AF119157; AAF28850.1; JOINED.
DR EMBL; AF119158; AAF28850.1; JOINED.
DR EMBL; AF119159; AAF28850.1; JOINED.
DR EMBL; AF119160; AAF28850.1; JOINED.
DR EMBL; AF119161; AAF28850.1; JOINED.
DR EMBL; AF119162; AAF28851.1; -.
DR EMBL; AF119151; AAF28851.1; JOINED.
DR EMBL; AF119152; AAF28851.1; JOINED.
DR EMBL; AF119153; AAF28851.1; JOINED.
DR EMBL; AF119154; AAF28851.1; JOINED.
DR EMBL; AF119156; AAF28851.1; JOINED.
DR EMBL; AF119157; AAF28851.1; JOINED.
DR EMBL; AF119158; AAF28851.1; JOINED.
DR EMBL; AF119159; AAF28851.1; JOINED.
DR EMBL; AF119160; AAF28851.1; JOINED.
DR EMBL; AF119161; AAF28851.1; JOINED.
DR EMBL; AF119158; AAF28852.1; -.
DR EMBL; AF119151; AAF28852.1; JOINED.
DR EMBL; AF119152; AAF28852.1; JOINED.
DR EMBL; AF119153; AAF28852.1; JOINED.
DR EMBL; AF119154; AAF28852.1; JOINED.
DR EMBL; AF119155; AAF28852.1; JOINED.
DR EMBL; AF119156; AAF28852.1; JOINED.
DR EMBL; AF119157; AAF28853.1; -.
DR EMBL; AF119151; AAF28853.1; JOINED.
DR EMBL; AF119152; AAF28853.1; JOINED.
DR EMBL; AF119153; AAF28853.1; JOINED.
DR EMBL; AF119154; AAF28853.1; JOINED.
DR EMBL; AF119155; AAF28853.1; JOINED.
DR EMBL; AF119156; AAF28853.1; JOINED.
DR PIR; JC5700; JC5700.
DR HSSP; Q12784; 1HRE.
DR Genew; HGNC:7998; NRG2.
DR MIM; 603818; -.
DR GO; GO:0005102; F:receptor binding;
DR GO; GO:0007165; P:signal transducti
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; lEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR002154; Neuregulin.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.







DR Pfam; PF02158; Neuregulin; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Multigene family; Alternative splicing.
FT PROPEP ? ? BY SIMILARITY.
FT CHAIN ? ? PRO-NEUREGULIN-2, MEMBRANE-BOUND FORM.
FT CHAIN ? ? NEUREGULIN-2.
FT DOMAIN ? ? EXTRACELLULAR (POTENTIAL).
FT TRANSMEM ? ? INTERNAL SIGNAL SEQUENCE (POTENTIAL).
FT DOMAIN ? ? CYTOPLASMIC (POTENTIAL).
FT DOMAIN ? ? IG-LIKE C2-TYPE.
FT DOMAIN ? ? SER/THR-RICH.
FT DOMAIN ? ? EGF-LIKE.
FT DOMAIN 10 ? POLY-PRO.
FT DOMAIN 20 30 POLY-SER.
FT DOMAIN 33 47 POLY-SER.
FT DOMAIN 87 90 POLY-ALA.
FT DOMAIN 721 727 POLY-PRO.
FT DISULFID 257 311 BY SIMILARITY.
FT DISULFID 345 359 BY SIMILARITY.
FT DISULFID 353 370 BY SIMILARITY.
FT DISULFID 372 381 BY SIMILARITY.
FT CARBOHYD 52 52 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 53 53 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 147 147 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 278 278 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 346 346 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 1 233 MRQVCCSALPPPPLEKGRCSSYSDSSSSSSERSSSSSSSSS
FT ESGSSSRSSSNNSSISRPAAPPEPRPQQQPQPRSPAARRAA
FT ARSRAAAAGGMRRDPAPGFSMLLFGVSLACYSPSLKSVQDQ
FT AYKAPVWEGKVQGLVPAGGSSSNSTREPPASGRVALVKVL
FT DKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPL
FT VFKTAFAPLDTNGKNLKKEVGKILCTDC -> MSESRRRGR



Query Match 95.6%; Score 1505; DB 1; Length 850; 

Best Local Similarity 98.6%; Pred. No. 7.6e-113; 

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0;



Qy 1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60

I I I I I I M I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I
Db 93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 152

Qy 61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

I M I I I I I I I I I I I I ri I I I I I I I I I I I I I I I I M M M I I I I I I I I M I I I I I I I I I M
Db 153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I M I I I I I I I I I I I I I I I M I I I I I
Db 213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

I I I I I I I I M I I I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I I I I I M I I I I I I I I
Db 273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289

I I I I I I I I I I I M I I M I I I I I I I I I I I I I I I I I I I I I I I I I : I II
Db 333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 381

RESULT 3 
NRG2_RAT

ID NRG2_RAT STANDARD; PRT; 868 AA.

AC O35569; O35073; O35570; O35571; O35572;

DT 15-DEC-1998 (Rel. 37, Created) 

DT 15-DEC-1998 (Rel. 37, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Pro-neuregulin-2 precursor (Pro-NRG2) [Contains: Neuregulin-2 (NRG-2)
DE (Neural- and thymus-derived activator for ERBB kinases) (NTAK)].
GN NRG2 OR NTAK.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A., SEQUENCE OF 128-162, AND ALTERNATIVE SPLICING. 

RX MEDLINE=98006324; PubMed=9348101;
RA Higashiyama S., Horikawa M., Yamada K., Ichino N., Nakano N.,
RA Nakagawa T., Miyagawa J., Matsushita N., Nagatsu T., Taniguchi N.,
RA Ishiguro H.;

RT "A novel brain-derived member of the epidermal growth factor family 

RT that interacts with ErbB3 and ErbB4."; 

RL J. Biochem. 122:675-680(1997). 

RN [2] 

RP SEQUENCE OF 109-868 FROM N.A. (ISOFORMS 6 AND 7).
RC TISSUE=Cerebellum;
RX MEDLINE=97311397; PubMed=9168114;
RA Chang H., Riese D.J. II, Gilbert W., Stern D.F., McMahan U.J.;
RT "Ligands for ErbB-family receptors encoded by a neuregulin-like
RT gene.";

RL Nature 387:509-512(1997). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase 

CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors,

CC resulting in ligand-stimulated tyrosine phosphorylation and 

CC activation of the ERBB receptors. May also promote the 

CC heterodimerization with the EGF receptor. 

CC -!- SUBCELLULAR LOCATION: EXISTS AS A TYPE I MEMBRANE PROTEIN AND AS
CC A PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE
CC MEMBRANE-BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY).

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=7; 

CC Comment=Additional isoforms seem to exist. The alpha-type and 

CC beta-type differ in the EGF-LIKE domain; 

CC Name=1; Synonyms=NTAK-alpha1;
CC IsoId=O35569-1; Sequence=Displayed;
CC Name=2; Synonyms=NTAK-alpha2A;
CC IsoId=O35569-2; Sequence=VSP_003471;
CC Name=3; Synonyms=NTAK-alpha2B, NTAK-alpha2-lP;
CC IsoId=O35569-3; Sequence=VSP_003466, VSP_003471;
CC Name=4; Synonyms=NTAK-beta;
CC IsoId=O35569-4; Sequence=VSP_003470;
CC Name=5; Synonyms=NTAK-gamma;
CC IsoId=O35569-5; Sequence=VSP_003467, VSP_003468;
CC Name=6; Synonyms=NRG2-alpha;
CC IsoId=O35569-6; Sequence=VSP_003472, VSP_003473;
CC Name=7; Synonyms=NRG2-beta;
CC IsoId=O35569-7; Sequence=VSP_003465, VSP_003469;

CC -!- TISSUE SPECIFICITY: Expressed in most parts of the brain, 

CC especially the olfactory bulb and cerebellum where it localizes in 

CC granule and Purkinje cells. In the hippocampus, found in the

CC granule cells of the dentate gyrus. In the basal forebrain, found

CC in the cholinergic cells. In the hindbrain, weakly detectable in 

CC the motor trigeminal nucleus. Not detected in the hypothalamus. 

CC Also found in the liver and in the thymus. Not detected in heart, 

CC adrenal gland, or testis. 

CC -!- DEVELOPMENTAL STAGE: In the embryo, expressed in the brain of
CC E11.5 embryos where it is found in the telencephalon, but not in

CC the hindbrain. Not found in the heart. In the adult, found in 

CC brain and thymus. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation 
CC of trafficking and proteolytic processing. Regulation of the 

CC proteolytic processing involves initial intracellular domain 

CC dimerization (By similarity).
CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain (By similarity).
CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor
CC form (By similarity).
CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).
CC -!- SIMILARITY: Contains 1 EGF-like domain.
CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.
CC -!- SIMILARITY: Belongs to the neuregulin family.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D89995; BAA23344.1; -. 

DR EMBL; D89996; BAA23345.1; -. 

DR EMBL; D89997; BAA23346.1; -.
DR EMBL; D89998; BAA23347.1; -.
DR EMBL; AB001576; BAA23348.1; -.
DR PIR; JC5701; JC5701.
DR PIR; JC5702; JC5702.
DR HSSP; Q12784; 1HRE.

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; lEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR002154; Neuregulin. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 



DR Pfam; PF02158; Neuregulin; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Multigene family; Alternative splicing.
Missing (in isoform 2 and isoform 3).
FT /FTId=VSP_003471.
6).
FT CONFLICT 117 117 S -> F (IN REF. 2).
FT CONFLICT 724 724 R -> H (IN REF. 2).
SQ SEQUENCE 868 AA; 93776 MW; 3C7D4D94DBE64DE2 CRC64;



Query Match 93.4%; Score 1470; DB 1; Length 868; 

Best Local Similarity 95.8%; Pred. No. 4.9e-110; 

Matches 277; Conservative 5; Mismatches 7; Indels 0; Gaps 0;



Qy 1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60 

I I I I M I I I I I I I I I I I I I I I M I M I I I I II I I I I I I I 

Db 109 MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPWVEGKVQGLAPAGGSSSNSTREP 168 

Qy 61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 169 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228 

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180 

I : I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I M : I I I I M I I I I I I I I I I I I I I I I I I 
Db 229 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288 

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240 

I I I I M M I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I 
Db 289 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348 

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I : I II 
Db 349 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 397 



RESULT 4 
NRG1_CHICK 

ID NRG1_CHICK STANDARD; PRT; 602 AA. 

AC Q05199; O73750; O73751; O73752;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1
DE (Acetylcholine receptor inducing activity) (ARIA)].
GN NRG1 OR ARIA.
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 

OC Gallus. 

OX NCBI_TaxID=9031;
RN [1]
RP SEQUENCE FROM N.A. (ISOFORM 1), AND PARTIAL SEQUENCE.
RC STRAIN=White leghorn; TISSUE=Brain;
RX MEDLINE=93201602; PubMed=8453670;
RA Falls D.L., Rosen K.M., Corfas G., Lane W.S., Fischbach G.D.;

RT "ARIA, a protein that stimulates acetylcholine receptor synthesis, is 

RT a member of the neu ligand family."; 

RL Cell 72:801-815(1993). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORMS 2; 3 AND 4). 

RC TISSUE=Brain, and Spinal cord; 

RX MEDLINE=98150951; PubMed=9491987;
RA Yang X., Kuo Y., Devay P., Yu C., Role L.;
RT "A cysteine-rich isoform of neuregulin controls the level of
RT expression of neuronal nicotinic receptor channels during
RT synaptogenesis.";
RL Neuron 20:255-270(1998).



CC -!- FUNCTION: Direct ligand for the ERBB tyrosine kinase receptors.
CC The multiple isoforms perform diverse functions: Cysteine-rich

CC domain containing isoforms (isoforms 2-4) probably regulate the 

CC expression of nicotinic acetylcholine receptors at developing 

CC interneuronal synapses. The Ig-NRG isoform is required for the 

CC initial induction and/or maintenance of the mature levels of 

CC acetylcholine receptors at neuromuscular synapses. 

CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as a 
CC proteolytically released soluble growth factor form. The membrane- 

CC bound form does not seem to be active (By similarity).
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=4;
CC Comment=Additional isoforms seem to exist;
CC Name=1; Synonyms=ARIA, IG-NRG;
CC IsoId=Q05199-1; Sequence=Displayed;
CC Note=Contains an Ig-like domain;
CC Name=2; Synonyms=CRD-NRG-BETA1A;
CC IsoId=Q05199-2; Sequence=VSP_003445;
CC Note=The EGF-like domain is replaced by a Cysteine-rich domain
CC (CRD);

CC Name=3; Synonyms=CRD-NRG-BETA2A; 

CC IsoId=Q05199-3; Sequence=VSP_003445, VSP_003446; 

CC Note=The EGF-like domain is replaced by a Cysteine-rich domain 

CC (CRD); 

CC Name=4; Synonyms=CRD-NRG-BETA2B; 

CC IsoId=Q05199-4; Sequence=VSP_003445, VSP_003446, VSP_003447,
CC VSP_003448;
CC Note=The EGF-like domain is replaced by a Cysteine-rich domain
CC (CRD);

CC -!- DEVELOPMENTAL STAGE: Isoforms 2-4 are detected at embryonic day 4 
CC (ED4) in both visceral and somatic motor neurons of spinal cord 

CC and is highest at ED6. Isoform 1 is not expressed until ED 6 in 

CC spinal cord. At ED 11 both isoforms display comparable levels. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation 
CC of trafficking and proteolytic processing. Regulation of the 

CC proteolytic processing involves initial intracellular domain 

CC dimerization (By similarity).

CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like 
CC domain. 

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the 
CC external face leads to the release of the soluble growth factor 

CC form. 

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By 
CC similarity).

CC -!- SIMILARITY: Contains 1 EGF-like domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain. 

CC -!- SIMILARITY: Belongs to the neuregulin family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L11264; AAA49037.1; -. 



DR EMBL; AF045654; AAC05670.1; -. 

DR EMBL; AF045655; AAC05671.1; -.
DR EMBL; AF045656; AAC05672.1; -.
DR PIR; A45769; A45769.
DR HSSP; Q12784; 1HRE.

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; lEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR002154; Neuregulin. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 



DR PRINTS; PR01089; NEUREGULIN.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; FALSE_NEG.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Alternative splicing.


FT CHAIN 1 ? PRO-NEUREGULIN-1, MEMBRANE-BOUND FORM.
FT CHAIN 1 ? NEUREGULIN-1.
FT DOMAIN 1 ?
FT TRANSMEM ? ?
FT DOMAIN ? ? CYTOPLASMIC (POTENTIAL).
FT DOMAIN ? ?
FT DOMAIN ? ?
FT DOMAIN 137 181 EGF-LIKE.


FT DISULFID 49 105 BY SIMILARITY.
FT DISULFID 141 155 BY SIMILARITY.
FT DISULFID 149 169 BY SIMILARITY.
FT DISULFID 171 180 BY SIMILARITY.
FT CARBOHYD 21 21 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 113 113 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 126 126 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 1 127 MWATSEGPLQYSLAPTQTDVNSSYNTVPPKLKEMKNQEVAV
FT GQKLVLRCETTSEYPALRFKWLKNGKEITKKNRPENVKIPK
FT KQKKYSELHIYRATLADAGEYACRVSSKLGNDSTKASVIIT
FT DTNA -> MSEVGTETFPSPSAQLSPDASLGGLPAEENMPG
FT PHREDSRVPGVAGLASTCCVCLEAERLKGCLNSEKICIAPI
FT LACLLSLCLCIAGLKWVFVDKIFEYDSPTHLDPGRIGQDPR
FT STVDPTALSAWVPSEVYASPFPIPSLESKAEVTVQTDSSLV
FT PSRPFLQPSLYNRILDVGLWSSATPSLSPSSLEPTTASQAQ
FT ATETNLQTAPKLS (in isoform 2, isoform 3
FT and isoform 4).
FT /FTId=VSP_003445.
FT VARSPLIC 191 198 Missing (in isoform 3 and isoform 4).
FT /FTId=VSP_003446.
FT VARSPLIC 388 405 VSAMTTPARMSPVDFHTP -> HTPPTSLLLAGKVSLRVS
FT (in isoform 4).
FT /FTId=VSP_003447.
FT VARSPLIC 406 602 Missing (in isoform 4).
FT /FTId=VSP_003448.
SQ SEQUENCE 602 AA; 67453 MW; 4183C0E56CE5D346 CRC64;



Query Match 19.4%; Score 305.5; DB 1; Length 602;

Best Local Similarity 33.0%; Pred. No. 5.7e-17;

Matches 65; Conservative 37; Mismatches 74; Indels 21; Gaps 5;



Qy 109 TEQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAA 168 

: I I I : I II : I I I I : I I : I I I : I I : I I 

Db 5 SEGPLQYSLAPTQTDVNS SYNTVPPKLKEMKNQEVAVGQKLVLRCETT 52 

Qy 169 AGNPQPSYRWFKDGKEL NRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENIL 225 

: I ::| |:tM: II :::| :l I I : : Mill I : I 

Db 53 SEYPALRFKWLKNGKEITKKNRPENVKIP-KKQKKYSELHIYRATLADAGEYACRVSSKL 111 

Qy 226 GKDTVRGRLYVNSVSTTLSSWSG — HARKCNETAKSYCVNGGVCYYIEGI NQLSCKC 280 

II:: : : : | : | : I I || : I : : II I II I I : : : : I : I 

Db 112 GNDSTKASVIITDTNATSTSTTGTSHLTKCDIKQKAFCVNGGECYMVKDLPNPPRYLCRC 171 

Qy 281 PVGYTGDRCQQFAMVNF 297

I : I I I I I I : I : I 
Db 172 PNEFTGDRCQNYVMASF 188 



RESULT 5 
NRG1_XENLA



ID NRG1_XENLA STANDARD; PRT; 677 AA. 

AC O93383; Q9W6N0;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 15-MAR-2004 (Rel. 43, Last annotation update)
DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1].
GN NRG1.
OS Xenopus laevis (African clawed frog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae;
OC Xenopodinae; Xenopus.
OX NCBI_TaxID=8355;
RN [1]
RP SEQUENCE FROM N.A. (ISOFORM ALPHA1), AND ALTERNATIVE SPLICING.
RX MEDLINE=98352126; PubMed=9685585;
RA Yang J.F., Zhou H., Pun S., Ip N.Y., Peng H.B., Tsim K.W.K.;
RT "Cloning of cDNAs encoding xenopus neuregulin: expression in myotomal
RT muscle during embryo development.";
RL Brain Res. Mol. Brain Res. 58:59-73(1998).
RN [2]
RP SEQUENCE FROM N.A. (ISOFORM CRD).
RX MEDLINE=99316087; PubMed=10383827;

RA Yang J.F., Zhou H., Choi R.C., Ip N.Y., Peng H.B., Tsim K.W.K.; 

RT "A cysteine-rich form of Xenopus neuregulin induces the expression of 

RT acetylcholine receptors in cultured myotubes."; 

RL Mol. Cell. Neurosci. 13:415-429(1999). 

CC -!- FUNCTION: Direct ligand for the ERBB tyrosine kinase receptors.
CC Induces expression of acetylcholine receptor in synaptic nuclei.
CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as a
CC proteolytically released soluble growth factor form. The membrane-
CC bound form does not seem to be active (By similarity).
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Comment=Additional isoforms seem to exist. Isoforms have alpha
CC or beta-type EGF-like domains;
CC Name=Alpha1;
CC IsoId=O93383-1; Sequence=Displayed;
CC Name=CRD; Synonyms=CRD-NRG1, Cysteine rich domain;
CC IsoId=O93383-2; Sequence=VSP_003449, VSP_003450;
CC -!- TISSUE SPECIFICITY: Isoform alpha1 is expressed in brain and
CC muscle. Isoform CRD is expressed in brain and spinal cord, but at
CC very low level in muscle.
CC -!- DEVELOPMENTAL STAGE: Strong expression in developing brain and
CC spinal cord of the embryo. Also expressed in the myotomal muscle.
CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation
CC of trafficking and proteolytic processing. Regulation of the
CC proteolytic processing involves initial intracellular domain
CC dimerization (By similarity).
CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain.
CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor
CC form.
CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).
CC -!- SIMILARITY: Contains 1 EGF-like domain.
CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.
CC -!- SIMILARITY: Belongs to the neuregulin family.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



CC 

DR EMBL; AF076618; AAC26804.1; -.
DR EMBL; AF142632; AAD33893.1; -.
DR HSSP; Q12784; 1HRE.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.

DR InterPro; IPR002154; Neuregulin. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 

DR PRINTS; PR01089; NEUREGULIN. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS01186; EGF_2; 1. 

DR PROSITE; PS50026; EGF_3; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein; 

KW Transmembrane; Alternative splicing. 
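
Each annotation line in these entries starts with a two-letter line code (ID, AC, DE, OS, KW and so on), so the header of an entry can be grouped by code before any field-specific parsing. The sketch below is a minimal illustration that works on the single-space layout as it appears in this listing rather than the fixed-column Swiss-Prot layout; the function name `group_by_line_code` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

def group_by_line_code(lines: List[str]) -> Dict[str, List[str]]:
    """Group flat-file lines by their leading two-letter line code."""
    records: Dict[str, List[str]] = defaultdict(list)
    for raw in lines:
        code, _, rest = raw.strip().partition(" ")
        # Keep only lines that begin with a two-letter upper-case code.
        if len(code) == 2 and code.isalpha() and code.isupper():
            records[code].append(rest.strip())
    return dict(records)

sample = [
    "ID NRG1_XENLA STANDARD; PRT; 677 AA.",
    "",
    "OS Xenopus laevis (African clawed frog).",
    "KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;",
    "KW Transmembrane; Alternative splicing.",
]
entry = group_by_line_code(sample)

# KW lines form one semicolon-separated list terminated by a full stop.
keywords = [k.strip() for k in " ".join(entry["KW"]).rstrip(".").split(";") if k.strip()]
entry_name = entry["ID"][0].split()[0]
print(entry_name, keywords)
```

Blank lines and OCR fragments that do not start with a line code are skipped, which is convenient for a listing like this one but would hide damage in a stricter pipeline.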



FT CHAIN 1 259 NEUREGULIN ALPHA1 (BY SIMILARITY).
FT CHAIN 1 677 PRO-NEUREGULIN ALPHA1, MEMBRANE-BOUND
FT FORM (BY SIMILARITY).
FT DOMAIN 1 260 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 261 280 INTERNAL SIGNAL SEQUENCE (POTENTIAL).



FT DOMAIN [positions and description illegible in source]
FT [feature row illegible in source]
FT DOMAIN [positions and description illegible in source]
FT DISULFID 57 116 BY SIMILARITY.
FT DISULFID 192 206 BY SIMILARITY.
FT DISULFID 200 220 BY SIMILARITY.
FT DISULFID 222 231 BY SIMILARITY.
FT DOMAIN 1 [end position and description illegible in source]
FT CARBOHYD 124 124 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 130 130 N-LINKED (GLCNAC...) (POTENTIAL).


FT VARSPLIC 1 136 [original and replacement sequences
FT illegible in source; the replacement ends
FT with FKFGTSLLPTE] (in isoform CRD).
FT /FTId=VSP_003449.


FT VARSPLIC 223 252 KPGFTGARCTETDPLRWRSEKHLGIEFME -> PNEFTGD
FT RCQNYVMASFYK (in isoform CRD).
FT /FTId=VSP_003450.
SQ SEQUENCE 677 AA; 75794 MW; 49279E8F5BAE396F CRC64;



Query Match 19.0%; Score 299.5; DB 1; Length 677;
Best Local Similarity 33.8%; Pred. No. 2e-16;
Matches 74; Conservative 24; Mismatches 68; Indels 53; Gaps 5;

Qy 126 GKNLKKEVGKIL CTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDG 182
II I II I II II : : I : I : I I : I II I : I : I : I I II
Db 15 GKGKKDRKGKKAEGSDQGAAASPKLKEIKTQSVQEGKKLVLKCQAVSEQPSLKFRWFKGE 74

Qy 183 KELNR SRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTV 230
II: I : II : I : I II : I I I I I I I I I I M
Db 75 KEIGAKNKPDSKPEHIKIRGKKKSSELQISKASSADNGEYKCMVSNQLGNDTVTVNVTIV 134

Qy 231 RGRLYVNSVSTTL SSWSGHARKCNE 255
: I I II :: ' M ll::
Db 135 PKPTYNHLLLMKIYLKVTSVEKSVEPSTLNLLESQKEVIFATTKRGDTTAGPGHLIKCSD 194

Qy 256 TAKSYCVNGGVCYYIEGI NQLSCKCPVGYTGDRCQQ 291
hllllll II : II II III l:ll II :
Db 195 KEKTYCVNGGECYVLNGITSSNQFMCKCKPGFTGARCTE 233



RESULT 6 
NRG1_RAT 

ID NRG1_RAT STANDARD; PRT; 662 AA. 

AC P43322; P43323; P43324; P43325; P43326; P43327; P43328; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1 (Neu
DE differentiation factor) (Heregulin) (HRG) (Acetylcholine receptor
DE inducing activity) (ARIA) (Sensory and motor neuron-derived factor)
DE (Glial growth factor)].
GN NRG1 OR NDF.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING. 

RC TISSUE=Fibroblast;
RX MEDLINE=94158863; PubMed=7509448;

RA Wen D., Suggs S.V., Karunagaran D., Liu N., Cupples R.L., Luo Y., 

RA Janssen A.M., Ben-Baruch N., Trollinger D.B., Jacobsen V.L., 

RA Meng S.-Y., Lu H.S., Hu S., Chang D., Yang W., Yanigahara D., 

RA Koski R.A., Yarden Y.; 

RT "Structural and functional aspects of the multiplicity of Neu 

RT differentiation factors."; 

RL Mol. Cell. Biol. 14:1909-1919(1994). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM ALPHA2C/NDF44), AND PARTIAL SEQUENCE.
RC TISSUE=Fibroblast;
RX MEDLINE=92257596; PubMed=1349853;
RA Wen D., Peles E., Cupples R., Suggs S.V., Bacus S.S., Luo Y.,
RA Trail G., Hu S., Silbiger S.M., Levy R.B., Koski R.A., Lu H.S.,
RA Yarden Y.;

RT "Neu differentiation factor: a transmembrane glycoprotein containing 

RT an EGF domain and an immunoglobulin homology unit."; 

RL Cell 69:559-572(1992). 

RN [3] 

RP SEQUENCE OF 14-36. 

RX MEDLINE=92208945; PubMed=1348215;
RA Peles E., Bacus S.S., Koski R.A., Lu H.S., Wen D., Ogden S.G.,
RA Levy R.B., Yarden Y.;

RT "Isolation of the neu/HER-2 stimulatory ligand: a 44 kd glycoprotein 

RT that induces differentiation of mammary tumor cells."; 

RL Cell 69:205-216(1992). 
RN [4] 

RP REGULATION OF PROCESSING (ISOFORM ALPHA2C/NDF44).
RX MEDLINE=99069430; PubMed=9852099;
RA Liu X., Hwang H., Cao L., Wen D., Liu N., Graham R.M., Zhou M.;

RT "Release of the neuregulin functional polypeptide requires its 

RT cytoplasmic tail."; 

RL J. Biol. Chem. 273:34335-34340(1998). 
RN [5] 

RP INTERACTION WITH LIMK1.
RX MEDLINE=98352096; PubMed=9685409;

RA Wang J.Y., Frenzel K.E., Wen D., Falls D.L.; 

RT "Transmembrane neuregulins interact with LIM kinase 1, a cytoplasmic 

RT protein kinase implicated in development of visuospatial cognition."; 

RL J. Biol. Chem. 273:20525-20534(1998). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase
CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors,
CC resulting in ligand-stimulated tyrosine phosphorylation and
CC activation of the ERBB receptors. The multiple isoforms perform
CC diverse functions such as inducing growth and differentiation of
CC epithelial, glial, neuronal, and skeletal muscle cells; inducing
CC expression of acetylcholine receptor in synaptic vesicles during
CC the formation of the neuromuscular junction; stimulating
CC lobuloalveolar budding and milk production in the mammary gland
CC and inducing differentiation of mammary tumor cells; stimulating
CC Schwann cell proliferation; implication in the development of the
CC myocardium such as trabeculation of the developing heart (By
CC similarity).
CC -!- SUBUNIT: The cytoplasmic domain interacts with the LIM domain
CC region of LIMK1.
CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as a
CC proteolytically released soluble growth factor form. The membrane-
CC bound form does not seem to be active.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=8;
CC Comment=Additional isoforms seem to exist;
CC Name=Beta4; Synonyms=NDF42A;
CC IsoId=P43322-1; Sequence=Displayed;
CC Name=Alpha2A; Synonyms=NDF38;
CC IsoId=P43322-2; Sequence=VSP_003436;
CC Name=Alpha2B; Synonyms=NDF19;
CC IsoId=P43322-3; Sequence=VSP_003436, VSP_003443, VSP_003444;
CC Name=Alpha2C; Synonyms=NDF44;
CC IsoId=P43322-4; Sequence=VSP_003436, VSP_003442;
CC Name=Beta1;
CC IsoId=P43322-5; Sequence=VSP_003437;
CC Name=Beta2; Synonyms=NDF40;
CC IsoId=P43322-6; Sequence=VSP_003440, VSP_003441;
CC Name=BetA2A; Synonyms=NDF22;
CC IsoId=P43322-7; Sequence=VSP_003440;
CC Name=Beta3; Synonyms=NDF4;
CC IsoId=P43322-8; Sequence=VSP_003438, VSP_003439;
CC -!- TISSUE SPECIFICITY: Widely expressed. Most tissues contain alpha2A
CC and alpha2B isoforms. Alpha2 and beta2 are the predominant forms
CC in mesenchymal and nonneuronal organs. Beta1 is enriched in brain
CC and spinal cord, but not in muscle and heart. Alpha2C is highly
CC expressed in spinal cord, moderately in lung, brain, ovary, and
CC stomach, in low amounts in the kidney, skin and heart and not
CC detected in the liver, spleen, and placenta.
CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation of
CC trafficking and proteolytic processing. Regulation of the
CC proteolytic processing involves initial intracellular domain
CC dimerization.
CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain.
CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor
CC form.
CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage.
CC -!- SIMILARITY: Contains 1 EGF-like domain.
CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.
CC -!- SIMILARITY: Belongs to the neuregulin family.

CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U02315; AAA19940.1; -.
DR EMBL; U02316; AAA19941.1; -.
DR EMBL; U02317; AAA19942.1; -.
DR EMBL; U02318; AAA19943.1; -.
DR EMBL; U02319; AAA19944.1; -.
DR EMBL; U02320; AAA19945.1; -.
DR EMBL; U02321; AAA19946.1; -.
DR EMBL; U02322; AAA19947.1; -.
DR EMBL; U02323; AAA19948.1; -.
DR EMBL; U02324; AAA19949.1; -.
DR EMBL; M92430; -; NOT_ANNOTATED_CDS.
DR PIR; 161718; 161718.
DR PIR; 161719; 161719.
DR PIR; 161722; 161722.
DR HSSP; Q12784; 1HRE.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.

DR InterPro; IPR002154; Neuregulin. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 



DR PRINTS; PR01089; NEUREGULIN.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; FALSE_NEG.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Multigene family; Alternative splicing.


FT PROPEP 1 13
FT CHAIN 14 662 PRO-NEUREGULIN-1, MEMBRANE-BOUND FORM.
FT CHAIN 14 264 NEUREGULIN-1.
FT DOMAIN 14 265 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 266 288 INTERNAL SIGNAL SEQUENCE (POTENTIAL).
FT DOMAIN 289 662 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 37 128 IG-LIKE C2-TYPE.
FT DOMAIN 165 177 SER/THR-RICH.
FT DOMAIN 178 222 EGF-LIKE.
FT DISULFID 57 112
FT DISULFID 182 196 BY SIMILARITY.
FT DISULFID 190 210 BY SIMILARITY.
FT DISULFID 212 221 BY SIMILARITY.
FT CARBOHYD 120 120 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 126 126 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 164 164 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 213 256 PNEFTGDRCQNYVMASFYMTSRRKRQETEKPLERKLDHSLV
FT KES -> QPGFTGARCTENVPMKVQTQE (in isoform
FT Alpha2A, isoform Alpha2B and isoform
FT Alpha2C).
FT /FTId=VSP_003436.
FT VARSPLIC 231 257 MTSRRKRQETEKPLERKLDHSLVKESK -> KHLGIEFME
FT (in isoform Beta1).
FT /FTId=VSP_003437.
FT VARSPLIC 231 241 MTSRRKRQETE -> STSTPFLSLPE (in isoform
FT Beta3).



FT /FTId=VSP_003438. 

FT VARSPLIC 242 662 Missing (in isoform Beta3).
FT /FTId=VSP_003439.
FT VARSPLIC 231 256 Missing (in isoform Beta2 and isoform
FT BetA2A).
FT /FTId=VSP_003440.
FT VARSPLIC 325 330 PPENVQ -> RVRTRG (in isoform Beta2).
FT /FTId=VSP_003441.
FT VARSPLIC 446 662 Missing (in isoform Alpha2C).
FT /FTId=VSP_003442.
FT VARSPLIC 446 484 YVSAMTTPARMSPVDFHTPSSPKSPPSEMSPPVSSMTVS
FT -> HNLIAELRRNKAYRSKCMQIQLSATHLRPSSITHLGFI
FT L (in isoform Alpha2B).
FT /FTId=VSP_003443.
FT VARSPLIC 485 662 Missing (in isoform Alpha2B).
FT /FTId=VSP_003444.

FT CONFLICT 90 90 K -> N (IN REF. 2). 

FT CONFLICT 137 137 T -> I (IN REF. 2; AA SEQUENCE). 

FT CONFLICT 208 208 Y -> S (IN REF. 2). 
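
The VARSPLIC rows above describe each isoform as a set of edits to the displayed sequence: a 1-based inclusive range that is either replaced ("X -> Y") or deleted ("Missing"). A minimal sketch of how such edits could be applied follows. The function name `apply_varsplic` and the toy sequence are hypothetical, the edits are assumed not to overlap, and the real 662-residue sequence is not reproduced here because it does not appear in this listing.

```python
from typing import List, Tuple

# (start, end, replacement): 1-based inclusive coordinates on the displayed
# sequence; an empty replacement models a "Missing" feature.
Edit = Tuple[int, int, str]

def apply_varsplic(displayed: str, edits: List[Edit]) -> str:
    """Apply non-overlapping VARSPLIC-style edits to the displayed sequence."""
    seq = displayed
    # Work from the C-terminus backwards so that edits nearer the
    # N-terminus still refer to the original coordinates.
    for start, end, replacement in sorted(edits, reverse=True):
        if not (1 <= start <= end <= len(displayed)):
            raise ValueError(f"feature {start}-{end} is outside the sequence")
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq

# Toy example: replace residues 3-4 and delete residues 8-10.
toy = "ABCDEFGHIJ"
isoform = apply_varsplic(toy, [(3, 4, "xyz"), (8, 10, "")])
print(isoform)
```

Applying the edits in descending order of start position avoids having to re-map coordinates after each insertion or deletion.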



Query Match 19.0%; Score 299; DB 1; Length 662;
Best Local Similarity 33.8%; Pred. No. 2.1e-16;
Matches 67; Conservative 34; Mismatches 53; Indels 44; Gaps 6;

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198
I I : I I : I M I II I : I I : : : : I M : I I I I I : I : I : I
Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 93

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236
: I I : I I : I : I I I : I : : I I I : : II
Db 94 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 151

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279
I : I : I : I : I I III I : : M I I I I : : : : : I II
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 211

Qy 280 CPVGYTGDRCQQFAMVNF 297
II : I I I I II : I : I
Db 212 CPNEFTGDRCQNYVMASF 229
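
The per-hit summary lines are a sequence of "name value;" pairs, so they can be pulled into a dictionary with one regular expression. The sketch below is an illustration written against the layout seen in this listing only. The pattern `FIELD` and the function `parse_summary` are hypothetical names, and the pattern assumes decimal points rather than the OCR commas that appear in a few places in the raw scan.

```python
import re
from typing import Dict

# A field is a label (letters, dots, spaces), whitespace, a number that may
# use exponent notation, an optional percent sign, and a closing semicolon.
FIELD = re.compile(r"([A-Za-z][A-Za-z. ]*?)\s+([0-9][0-9.eE+-]*)%?;")

def parse_summary(text: str) -> Dict[str, float]:
    """Return the numeric fields of a hit summary keyed by their labels."""
    return {name.strip(): float(value) for name, value in FIELD.findall(text)}

summary = (
    "Query Match 19.0%; Score 299; DB 1; Length 662; "
    "Best Local Similarity 33.8%; Pred. No. 2.1e-16; "
    "Matches 67; Conservative 34; Mismatches 53; Indels 44; Gaps 6;"
)
fields = parse_summary(summary)
columns = sum(fields[k] for k in ("Matches", "Conservative", "Mismatches", "Indels"))
print(fields["Score"], fields["Pred. No."], columns)
```

Once parsed, the fields can be cross-checked against each other; for this hit, the matches divided by the summed column counts round to the printed 33.8%.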



RESULT 7 
NRG1_HUMAN 

ID NRG1_HUMAN STANDARD; PRT; 639 AA. 

AC Q02297; O14667; P98202; Q02298; Q02299; Q07110; Q07111; Q12779;

AC Q12780; Q12781; Q12782; Q12783; Q12784; Q9UPE3; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1 (Neu
DE differentiation factor) (Heregulin) (HRG) (Breast cancer cell
DE differentiation factor p45) (Acetylcholine receptor inducing activity)
DE (ARIA) (Sensory and motor neuron-derived factor) (Glial growth
DE factor)].
GN NRG1 OR HGL OR NDF OR HRGA OR GGF OR SMDF.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;



OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1; 6; 7 AND 8), AND PARTIAL 

RP SEQUENCE. 

RX MEDLINE=92271253; PubMed=1350381;
RA Holmes W.E., Sliwkowski M.X., Akita R.W., Henzel W.J., Lee J.,
RA Park J.W., Yansura D., Abadi N., Raab H., Lewis G.D., Shepard H.M.,
RA Kuang W.-J., Wood W.I., Goeddel D.V., Vandlen R.L.;
RT "Identification of heregulin, a specific activator of p185erbB2.";

RL Science 256:1205-1210(1992). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORMS 2; 3; 4; 6; 7 AND 8). 

RC TISSUE=Kidney adenocarcinoma, and Pituitary; 

RX MEDLINE=94158863; PubMed=7509448;
RA Wen D., Suggs S.V., Karunagaran D., Liu N., Cupples R.L., Luo Y.,
RA Janssen A.M., Ben-Baruch N., Trollinger D.B., Jacobsen V.L.,
RA Meng S.-Y., Lu H.S., Hu S., Chang D., Yang W., Yanigahara D.,
RA Koski R.A., Yarden Y.;

RT "Structural and functional aspects of the multiplicity of Neu 

RT differentiation factors."; 

RL Mol. Cell. Biol. 14:1909-1919(1994). 

RN [3] 

RP SEQUENCE FROM N.A. (ISOFORM 1).
RX MEDLINE=92208945; PubMed=1348215;
RA Peles E., Bacus S.S., Koski R.A., Lu H.S., Wen D., Ogden S.G.,
RA Levy R.B., Yarden Y.;

RT "Isolation of the neu/HER-2 stimulatory ligand: a 44 kd glycoprotein 

RT that induces differentiation of mammary tumor cells."; 

RL Cell 69:205-216(1992). 

RN [4] 

RP SEQUENCE FROM N.A. (ISOFORMS 8 AND 9). 

RC TISSUE=Brain; 

RX MEDLINE=93205115; PubMed=8096067;
RA Marchionni M.A., Goodearl A.D.J., Chen M.S., Bermingham-McDonogh O.,
RA Kirk C., Hendricks M., Danehy F., Misumi D., Sudhalter J.,
RA Kobayashi K., Wroblewski D., Lynch C., Baldasarre M., Riles I.,
RA Davis J.B., Hsuan J.J., Totty N.F., Otsu M., McBurney R.N.,
RA Waterfield M.D., Stroobant P., Gwynne D.;

RT "Glial growth factors are alternatively spliced erbB2 ligands 

RT expressed in the nervous system."; 

RL Nature 362:312-318(1993). 
RN [5] 

RP SEQUENCE FROM N.A. OF GAMMA-HEREGULIN FUSION PROTEIN. 

RC TISSUE=Breast cancer; 

RX MEDLINE=97472144; PubMed=9333014;
RA Schaefer G., Fitzpatrick V.D., Sliwkowski M.X.;
RT "Gamma-heregulin: a novel heregulin isoform that is an autocrine
RT growth factor for the human breast cancer cell line, MDA-MB-175.";

RL Oncogene 15:1385-1394(1997). 
RN [6] 

RP SEQUENCE OF 1-210 FROM N.A. 

RA Schoumacher F., Herzer S., Flury N., Kueng W., Mueller H., 

RA Eppenberger U.; 

RL Submitted (SEP-1997) to the EMBL/GenBank/DDBJ databases. 
RN [7] 

RP SEQUENCE OF 19-27. 



RX MEDLINE=93366731; PubMed=7689552;
RA Culouscou J.-M., Plowman G.D., Carlton G.W., Green J.M., Shoyab M.;
RT "Characterization of a breast cancer cell differentiation factor that
RT specifically activates the HER4/p180erbB4 receptor.";
RL J. Biol. Chem. 268:18407-18410(1993).
RN [8]
RP CHROMOSOMAL TRANSLOCATION.
RX MEDLINE=99455251; PubMed=10523851;
RA Wang X.-Z., Jolicoeur E.M., Conte N., Chaffanet M., Zhang Y.,
RA Mozziconacci M.-J., Feiner H., Birnbaum D., Pebusque M.-J., Ron D.;
RT "Gamma-heregulin is the product of a chromosomal translocation fusing
RT the DOC4 and HGL/NRG1 genes in the MDA-MB-175 breast cancer cell

RT line."; 

RL Oncogene 18:5718-5721(1999). 

RN [9] 

RP CHROMOSOMAL TRANSLOCATION. 

RX MEDLINE=20065180; PubMed=10597312;
RA Liu X., Baker E., Eyre H.J., Sutherland G.R., Zhou M.;
RT "Gamma-heregulin: a fusion gene of DOC-4 and neuregulin-1 derived from
RT a chromosome translocation.";
RL Oncogene 18:7110-7114(1999).
RN [10]
RP STRUCTURE BY NMR OF 175-241 (ISOFORM 1).
RX MEDLINE=94341264; PubMed=8062828;
RA Nagata K., Kohda D., Hatanaka H., Ichikawa S., Matsuda S.,
RA Yamamoto T., Suzuki A., Inagaki F.;
RT "Solution structure of the epidermal growth factor-like domain of
RT heregulin-alpha, a ligand for p180erbB-4.";

RL EMBO J. 13:3517-3523(1994). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase
CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors,
CC resulting in ligand-stimulated tyrosine phosphorylation and
CC activation of the ERBB receptors. The multiple isoforms perform
CC diverse functions such as inducing growth and differentiation of
CC epithelial, glial, neuronal, and skeletal muscle cells; inducing
CC expression of acetylcholine receptor in synaptic vesicles during
CC the formation of the neuromuscular junction; stimulating
CC lobuloalveolar budding and milk production in the mammary gland
CC and inducing differentiation of mammary tumor cells; stimulating
CC Schwann cell proliferation; implication in the development of the
CC myocardium such as trabeculation of the developing heart.
CC -!- SUBUNIT: The cytoplasmic domain interacts with the LIM domain
CC region of LIMK1 (By similarity).
CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as
CC a proteolytically released soluble growth factor form. The
CC membrane-bound form does not seem to be active. The secreted
CC isoform 9 has a signal peptide. The isoform 8 may be nuclear.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=9;
CC Comment=Additional isoforms seem to exist. Isoforms have been
CC classified as type I NRGS (isoforms with an Ig domain and a
CC glycosylation domain, isoforms 1-8), type II NRGS (isoforms with
CC an Ig domain but no glycosylation domain, isoform 9) and type
CC III NRGS (isoforms with a Cys-rich domain, isoform 10). All
CC these isoforms perform distinct tissue-specific functions;
CC Name=1; Synonyms=Alpha;
CC IsoId=Q02297-1; Sequence=Displayed;
CC Name=2; Synonyms=Alpha1A;
CC IsoId=Q02297-2; Sequence=VSP_003431;
CC Name=3; Synonyms=Alpha2B;
CC IsoId=Q02297-3; Sequence=VSP_003434, VSP_003435;
CC Name=4; Synonyms=Alpha3;
CC IsoId=Q02297-4; Sequence=VSP_003432, VSP_003433;
CC Name=6; Synonyms=Beta1, Beta1A;
CC IsoId=Q02297-6; Sequence=VSP_003428;
CC Name=7; Synonyms=Beta2;
CC IsoId=Q02297-7; Sequence=VSP_003427;
CC Name=8; Synonyms=Beta3, GGFHFB1;
CC IsoId=Q02297-8; Sequence=VSP_003429, VSP_003430;
CC Name=9; Synonyms=GGF2, GGFHPP2;
CC IsoId=Q02297-9; Sequence=VSP_003425, VSP_003426, VSP_003429,
CC VSP_003430;
CC Name=10; Synonyms=SMDF;
CC IsoId=Q15491-1; Sequence=External;

CC -!- TISSUE SPECIFICITY: Type I isoforms are the predominant forms 

CC expressed in the endocardium. Isoform alpha is expressed in 

CC breast, ovary, testis, prostate, heart, skeletal muscle, lung, 

CC placenta, liver, kidney, salivary gland, small intestine and brain,

CC but not in uterus, stomach, pancreas, and spleen. Isoform 3 is the 

CC predominant form in mesenchymal cells and in nonneuronal organs, 

CC whereas isoform 5 is the major neuronal form. Isoform 8 is 

CC expressed in spinal cord and brain. Isoform 9 is the major form in 

CC skeletal muscle cells; in the nervous system it is expressed in 

CC spinal cord and brain. Also detected in adult heart, placenta, 

CC lung, liver, kidney, and pancreas. 

CC -!- DEVELOPMENTAL STAGE: Detectable at early embryonic ages. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation
CC of trafficking and proteolytic processing. Regulation of the
CC proteolytic processing involves initial intracellular domain
CC dimerization (By similarity).
CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain.
CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor
CC form.
CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).
CC -!- DISEASE: Involved in a rare t(8;11) chromosomal translocation that
CC fuses the 5' end of ODZ4 to NRG1 (isoform 8). The product of this
CC translocation was first thought to be an alternatively spliced
CC isoform, called gamma-heregulin. Gamma-heregulin is a soluble
CC activating ligand for the ERBB2-ERBB3 receptor complex and acts as
CC an autocrine growth factor in a specific breast cancer cell line
CC (MDA-MB-175). Not detected in breast carcinoma samples, including
CC ductal, lobular, medullary, and mucinous histological types,
CC neither in other breast cancer cell lines.
CC -!- SIMILARITY: Contains 1 EGF-like domain.
CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.
CC -!- SIMILARITY: Belongs to the neuregulin family.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 



DR EMBL; M94165; AAA58638.1; -.
DR EMBL; M94166; AAA58639.1; -.
DR EMBL; M94167; AAA58640.1; -.
DR EMBL; M94168; AAA58641.1; -.
DR EMBL; U02325; AAA19950.1; -.
DR EMBL; U02326; AAA19951.1; -.
DR EMBL; U02327; AAA19952.1; -.
DR EMBL; U02328; AAA19953.1; -.
DR EMBL; U02329; AAA19954.1; -.
DR EMBL; U02330; AAA19955.1; -.
DR EMBL; L12260; AAB59622.1; -.

Query Match [value illegible in source];
Best Local Similarity 32.1%; Pred. No. 6.1e-15;
Matches 69; Conservative 35; Mismatches 60; Indels 51; Gaps 7;



Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178
M I I I I : I I : I I : I I I I I I I : II : : : : I
Db 10 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 69

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234
11:11111 : : I : I : I : I I : I I : I : I I I : I : : I I I : :
Db 70 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 127

Qy 235 YV NSVSTTLSSWSG--HARKCNETAKS 259
M I : I : I : I : I I III I :
Db 128 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 187

Qy 260 YCVNGGVCYYIEGINQLS CKCPVGYTGDRCQQ 291
: i I I I I I : : : : : I III I : M II :
Db 188 FCVNGGECFMVKDLSNPSRYLCKCQPGFTGARCTE 222



RESULT 8
VEIN_DROME
ID VEIN_DROME STANDARD; PRT; 623 AA.
AC Q94918; Q9VRQ3;
DT 16-OCT-2001 (Rel. 40, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 15-MAR-2004 (Rel. 43, Last annotation update)
DE Vein protein precursor (Epidermal growth factor-like protein)
DE (Defective dorsal discs protein).
GN VN OR DDD OR CG10491.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC TISSUE=Embryo, and Imaginal disks;
RX MEDLINE=96421972; PubMed=8824589;
RA Schnepp B.C., Grumbling G.B., Donaldson T.D., Simcox A.A.;



RT "Vein is a novel component in the Drosophila epidermal growth factor 

RT receptor pathway with similarity to the neuregulins.";

RL Genes Dev. 10:2302-2313(1996). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132 ; 

RA Adams M.D., Celniker S.E., Holt R.A. , Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W,, Hoskins R.A, , Galle R.F., 

RA George R.A., Lewis S.E., Richards S., Ashburner M. , Henderson S.N., 

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 

RA Brandon R.C., Rogers Y.-H.C, Blazej R.G,, Champe M. , Pfeiffer B.D., 

RA Wan K.H., Doyle C, Baxter E.G., Helt G. , Nelson C.R., Miklos G.L.G., 

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pf annkoch Baldwin D. , 

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 

RA Cherry J.M., Cawley S., Dahlke C, Davenport L.B., Davies P., 

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 

RA Durbin K. J., Evangelista C.C., Ferraz Ferriera S., Fleischmann W . , 

RA Fosler C, Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 

RA Glodek A., Gong F., Gorreli J.H., Gu Z., Guan P., Harris M. , 

RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J., 

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C, 

RA Jalali M. , Kalush F., Karpen G.H., Ke Z . , Kennison J. A., Ketchum K.A., 

RA Kimmel B.E., Kodira CD., Kraft C, Kravitz S., Kulp D., Lai Z., 

RA Lasko P., Lei Y., Levitsky A. A. , Li J.H., Li Z., Liang Y. , Lin X., 

RA Liu X., Mattel B., Mcintosh T.C., McLeod M.P., McPherson D., 

RA Merkulov G., Milshina N.V., Mobarry C, Morris J., Moshrefi A., 

RA Mount S.M., Moy M. , Murphy B., Murphy L. , Muzny D.M., Nelson D.L., 

RA Nelson D.R., Nelson K.A. , Nixon K., Nusskern D.R., Pacleb J.M., 

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Purl V., Reese M.G., 

RA Reinert K. , Remington K. , Saunders R.D.C., Scheeler F. , Shen H., 

RA Shue B.C., Siden-Kiamos I., Simpson M. , Skupski M.P., Smith T., 

RA Spier E. , Spradling A.C., Stapleton M. , Strong R., Sun E., 

RA Svirskas R. , Tector C, Turner R. , Venter E., Wang A.H., Wang X., 

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000).

RN [3] 

RP REVISIONS. 

RX MEDLINE=22426069; PubMed=12537572;

RA Misra S., Crosby M.A., Mungall C.J., Matthews B.B., Campbell K.S.,

RA Hradecky P., Huang Y., Kaminker J.S., Millburn G.H., Prochnik S.E.,

RA Smith C.D., Tupy J.L., Whitfield E.J., Bayraktaroglu L., Berman B.P.,

RA Bettencourt B.R., Celniker S.E., de Grey A.D.N.J., Drysdale R.A.,

RA Harris N.L., Richter J., Russo S., Schroeder A.J., Shu S.Q.,

RA Stapleton M., Yamada C., Ashburner M., Gelbart W.M., Rubin G.M.,

RA Lewis S.E.;

RT "Annotation of the Drosophila melanogaster euchromatic genome: a 

RT systematic review."; 



RL Genome Biol. 3:RESEARCH0083.1-RESEARCH0083.22(2002).

CC -!- FUNCTION: Ligand for the EGF receptor. Seems to play a role in 

CC the global proliferation of wing disc cells and the larval 

CC patterning. Shows a strong synergistic genetic interaction with 

CC spi, suggesting a molecular interdependence. Required for the 

CC development of interveins cells. 

CC -!- SUBCELLULAR LOCATION: Secreted.

CC -!- ALTERNATIVE PRODUCTS:

CC Event=Alternative splicing; Named isoforms=2;

CC Name=1;

CC IsoId=Q94918-1; Sequence=Displayed;

CC Name=2;

CC IsoId=Q94918-2; Sequence=VSP_001419;

CC -!- DEVELOPMENTAL STAGE: Expressed in blastoderm embryos in two 

CC ventrolateral stripes that are brought to the midline as 

CC gastrulation proceeds. In the germ-band retraction stage, 

CC expression is seen in the CNS and epidermis. At late blastoderm, 

CC expression is localized in the anlagen of the amnioserosa. 

CC Expression in the head, clypeolabrum, maxillary and labial lobes,

CC and around the stomodeum throughout embryo development. In late 

CC embryos, expression decays in all ectodermal cells and appears in 

CC the segmental muscles and the gut wall. In the larva, expression 

CC occurs in the dorsal metathoracic disc, the eye-antennal disc and 

CC the ventral thoracic disc. Found in the intervein in the pupa. 

CC -!- SIMILARITY: Contains 1 EGF-like domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U67935; AAC47293.1; -. 

DR EMBL; AE003564; AAF50739.2; -.

DR FlyBase; FBgn0003984; vn. 

DR GO; GO:0005576; C:extracellular; NAS.

DR GO; GO:0005154; F:epidermal growth factor receptor binding; IMP.

DR GO; GO:0007173; P:EGF receptor signaling pathway; IMP.

DR GO; GO:0007477; P:notum morphogenesis; IMP.

DR GO; GO:0045742; P:positive regulation of EGF receptor signali...; NAS.

DR GO; GO:0007476; P:wing morphogenesis; IMP.

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; lEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2.

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS01186; EGF_2; FALSE_NEG. 

DR PROSITE; PS50026; EGF_3; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW EGF-like domain; Glycoprotein; Immunoglobulin domain; Growth factor; 
SQ SEQUENCE 623 AA; 71697 MW; AFD2724D5C1F56C8 CRC64;

Query Match 13.2%; Score 208.5; DB 1; Length 623;

Best Local Similarity 25.8%; Pred. No. 3.4e-09;

Matches 79; Conservative 42; Mismatches 104; Indels 81; Gaps 15;



Qy 32 AYKAPVWEG KVQGLVPAGGSSSNSTREPPA 62

I: M I :| II = Ml: I

Db 329 AFAAPTVFQGVFKSMSADRRVNFSATMKVEKVYKQQHDLQLPTLVRLQFALSNSSGECD- 387

Qy 63 SGRVALVKVLDKWPLRSGG-LQREQVISVGSCVPLERNQRYIFFLEPTE QP- 112

: : : : : M I I II : II I : I : : I II

Db 388 IYRERLMPRGMLRSGNDLQQASDIS YMMFVQQTNPGNFTILGQPM 432

Qy 113 LVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAA 168

II: : I I I I I : : I : I I : j : II

Db 433 RVTHLWEAVETAVSEN-YTQNAEVTKIF SKPSKAIIKH GKKLRIVCE-V 480

Qy 169 AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRL---QFNKVKVEDAGEYVCEAENIL 225

: I I I I I II I : I I I : I : : : : : I I II I I I I I I : I

Db 481 SGQPPPKVTWFKDEKSINRKRNI-YQFKHHKRRSELIVRSFN--SSSDAGRYECRAKNKA 537

Qy 226 GKDTVRGRLYVNSVSTTL-SSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGY 284

I : I : : : : I M II : I I I M : : I : I I I

Db 538 SKAIAKRRIMIKASPVHFPTDRSASGIPCN---FDYCFHNGTCRMIPDINEVYCRCPTEY 594

Qy 285 TGDRCQ 290

I : I I :

Db 595 FGNRCE 600
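
The summary percentages printed for each hit can be rechecked from the raw counts. The sketch below assumes, from the numbers in this report rather than from GenCore documentation, that Best Local Similarity is matches divided by all alignment columns (matches + conservative + mismatches + indels) and that Query Match is the hit score divided by the perfect score (taken here as 1574 from the report header). The function names are hypothetical.

```python
# Sketch: recompute a hit's summary percentages from its raw counts.
# ASSUMPTION: formulas are inferred from the figures in this report.
# ASSUMPTION: PERFECT_SCORE is the query self-score shown in the report header.
PERFECT_SCORE = 1574.0


def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def query_match(score, perfect_score=PERFECT_SCORE):
    """Hit score as a percentage of the query's perfect (self) score."""
    return 100.0 * score / perfect_score


if __name__ == "__main__":
    # Counts reported for RESULT 8 above.
    print(f"{best_local_similarity(79, 42, 104, 81):.1f}%")
    print(f"{query_match(208.5):.1f}%")
```

With the RESULT 8 counts this gives values that round to the 25.8% and 13.2% shown in the listing.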



RESULT 9 
JAM2_HUMAN



ID JAM2_HUMAN STANDARD; PRT; 298 AA.

AC P57087; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Junctional adhesion molecule 2 precursor (Vascular endothelial

DE junction-associated molecule) (VE-JAM).

GN JAM2 OR VEJAM OR C21ORF43.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Vascular endothelial cells; 

RX MEDLINE=20317114; PubMed=10779521;

RA Palmeri D., van Zante A., Huang C.C., Hemmerich S., Rosen S.D.;

RT "Vascular endothelial junction-associated molecule, a novel member of 

RT the immunoglobulin superfamily, is localized to intercellular 

RT boundaries of endothelial cells."; 

RL J. Biol. Chem. 275:19139-19145(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Placenta; 

RX MEDLINE=20507930; PubMed=10945976; 

RA Cunningham S.A., Arrate M.P., Rodriguez J.M., Bjercke R.J., 

RA Vanderslice P., Morris A.P., Brock T.A.;

RT "A novel protein with homology to the junctional adhesion molecule: 

RT Characterization of leukocyte interactions."; 

RL J. Biol. Chem. 275:34750-34756(2000). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Lung; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K., 

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F., 

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).

CC -!- FUNCTION: MAY PLAY A ROLE IN THE PROCESSES OF LYMPHOCYTE HOMING TO
CC SECONDARY LYMPHOID ORGANS.

CC -!- SUBCELLULAR LOCATION: Type I membrane protein (Potential).

CC -!- TISSUE SPECIFICITY: PROMINENTLY EXPRESSED ON HIGH ENDOTHELIAL

CC VENULES BUT IS ALSO PRESENT ON THE ENDOTHELIA OF OTHER VESSELS.

CC LOCALIZED TO THE INTERCELLULAR BOUNDARIES OF HIGH ENDOTHELIAL

CC CELLS.

CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily.

CC -!- SIMILARITY: Contains 1 immunoglobulin-like V-type domain.

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.

CC -!- DATABASE: NAME=PROW; NOTE=PROW 2:1-3(2001);

CC WWW="http://www.ncbi.nlm.nih.gov/prow/guide/1652492186_g.htm".

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF255910; AAF81223.1; 

DR EMBL; AY016009; AAG49022.1; -. 

DR EMBL; BC017779; AAH17779.1; -.

DR Genew; HGNC:14686; JAM2.

DR MIM; 606870; -.

DR GO; GO:0005887; C:integral to plasma membrane; NAS.

DR GO; GO:0016337; P:cell-cell adhesion; NAS.

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00047; ig; 2. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS50835; IG_LIKE; 2. 

KW Immunoglobulin domain; Glycoprotein; Transmembrane; Signal. 
SQ SEQUENCE 298 AA; 33207 MW; CA78E518E22DCAEE CRC64;

Query Match 9.8%; Score 155; DB 1; Length 298;

Best Local Similarity 27.7%; Pred. No. 2.8e-05;

Matches 56; Conservative 24; Mismatches 86; Indels 36; Gaps 8;

Qy 96 LERNQRYIFFLEPTEQPLVFKTAFAPLDTN--GKNL-KKEVGKILCTDCATRPKLKKMKS 152

II: : : : : : : | | : I I | | : : : I I I I : : : :

Db 66 LGRSVSFVYYQQTLQGD--FKNRAEMIDFNIRIKNVTRSDAGKYRCEVSAPSEQGQNLEE 123

Qy 153 QTGQV GEKQSLKCEAAAGNPQPSYRWFKDGKELNR S 188

I : I I : I : I M I I I I I I I I I

Db 124 DTVTLEVLVAPAVPSCEVPSSALSGTWELRCQDKEGNPAPEYTWFKDGIRLLENPRLGS 183

Qy 189 RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRG-RLYVNSVSTTLSSWS 247

: I t I I I I I I I I I I I I I U 11:1::: I

Db 184 QSTNSSYTMNTKTGTLQFNTVSKLDTGEYSCEARNSVGYRRCPGKRMQVDDLNI S 238

Qy 248 GHARKCNETAKSYCVNG-GVCY 268

I I I I I I I I

Db 239 GIIAAWWALVISVCGLGVCY 260
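
Each database entry in this listing is a flat file whose lines begin with a two-letter code (ID, AC, DT, DE, GN, OS, OC, RN, CC, DR, KW, FT, SQ). A minimal sketch of grouping lines by that code and splitting the DR cross-references follows. The function names are hypothetical, the sample lines come from the JAM2_HUMAN entry above with OCR spacing normalized, and this is an illustration rather than a complete SWISS-PROT parser.

```python
from collections import defaultdict


def group_by_line_code(lines):
    """Group flat-file lines by their two-letter code (ID, AC, DE, DR, ...)."""
    record = defaultdict(list)
    for line in lines:
        line = line.strip()
        if len(line) < 2:
            continue  # skip the blank spacer lines between records
        code, _, rest = line.partition(" ")
        if len(code) == 2 and code.isupper():
            record[code].append(rest.strip())
    return dict(record)


def cross_references(record):
    """Split each DR line into (database, [fields]), dropping the final period."""
    refs = []
    for entry in record.get("DR", []):
        fields = [f.strip() for f in entry.rstrip(".").split(";")]
        fields = [f for f in fields if f]
        if not fields:
            continue  # a bare "DR" label with no content
        refs.append((fields[0], fields[1:]))
    return refs


# Sample lines (hypothetical selection, spacing normalized).
sample = [
    "ID JAM2_HUMAN STANDARD; PRT; 298 AA.",
    "",
    "AC P57087;",
    "DR InterPro; IPR007110; Ig-like.",
    "DR Pfam; PF00047; ig; 2.",
]
record = group_by_line_code(sample)
print(record["AC"])
print(cross_references(record))
```

Grouping by line code first keeps the sketch tolerant of the blank spacer lines and of records whose sections arrive out of order in this listing.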



RESULT 10 
SMDF_HUMAN



ID SMDF_HUMAN STANDARD; PRT; 296 AA.

AC Q15491; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update)

DE Neuregulin-1, sensory and motor neuron-derived factor isoform.

GN NRG1 OR HGL OR NDF OR HRGA OR GGF OR SMDF.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain stem, and Cerebellum; 

RX MEDLINE=95301541; PubMed=7782315; 

RA Ho W.-H., Armanini M.P., Nuijens A., Phillips H.S., Osheroff P.L.;

RT "Sensory and motor neuron-derived factor. A novel heregulin variant

RT highly expressed in sensory and motor neurons.";

RL J. Biol. Chem. 270:14523-14532(1995).

CC -!- FUNCTION: The isoform SMDF may play a role in motor and sensory

CC neuron development.

CC -!- SUBCELLULAR LOCATION: Secreted. May possess an internal uncleaved

CC signal sequence.

CC -!- ALTERNATIVE PRODUCTS:

CC Event=Alternative splicing; Named isoforms=10; 

CC Comment=Additional isoforms seem to exist. Isoforms have been

CC classified as type I NRGs (isoforms with an IG domain and a

CC glycosylation domain, alpha and beta), type II NRGs (isoforms

CC with an IG domain but no glycosylation domain - GGF2) and type

CC III NRGs (isoforms with a cys-rich domain - SMDF). All these

CC isoforms perform distinct tissue-specific functions;

CC Name=10; Synonyms=SMDF;

CC IsoId=Q15491-1; Sequence=Displayed;

CC Name=1; Synonyms=Alpha;

CC IsoId=Q02297-1; Sequence=External;

CC Name=2; Synonyms=Alpha1A;

CC IsoId=Q02297-2; Sequence=External;

CC Name=3; Synonyms=Alpha2B;

CC IsoId=Q02297-3; Sequence=External;

CC Name=4; Synonyms=Alpha3;

CC IsoId=Q02297-4; Sequence=External;

CC Name=5; Synonyms=Beta1;

CC IsoId=Q02297-5; Sequence=External;

CC Name=6; Synonyms=Beta1A;

CC IsoId=Q02297-6; Sequence=External;

CC Name=7; Synonyms=Beta2;

CC IsoId=Q02297-7; Sequence=External;

CC Name=8; Synonyms=Beta3, GGFHFB1;

CC IsoId=Q02297-8; Sequence=External;

CC Name=9; Synonyms=GGF2, GGFHPP2;

CC IsoId=Q02297-9; Sequence=External;

CC -!- TISSUE SPECIFICITY: Expressed in nervous system: spinal cord motor

CC neurons, dorsal root ganglion neurons, and brain. Predominant

CC isoform expressed in sensory and motor neurons. Not detected in

CC adult heart, placenta, lung, liver, skeletal muscle, kidney, and

CC pancreas. Not expressed in fetal lung, liver and kidney.

CC -!- DEVELOPMENTAL STAGE: Highly expressed in developing spinal motor

CC neurons and in developing cranial nerve nuclei. Expression is

CC maintained only in both adult motor neurons and dorsal root

CC ganglion neurons.

CC -!- SIMILARITY: Contains 1 EGF-like domain.

CC -!- SIMILARITY: Belongs to the neuregulin family.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L41827; AAC41764.1;

DR PIR; A56943; A56943.

DR HSSP; Q12784; 1HRE.

DR Genew; HGNC:7997; NRG1.

DR MIM; 142445; -.

DR GO; GO:0016020; C:membrane; NAS.

DR GO; GO:0007399; P:neurogenesis; NAS.

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; lEGF. 

DR Pfam; PF00008; EGF; 1. 

DR SMART; SM00181; EGF; 1. 
DR PROSITE; PS00022; EGF_1; 1.

DR PROSITE; PS01186; EGF_2; FALSE_NEG.

DR PROSITE; PS50026; EGF_3; 1.

KW Growth factor; EGF-like domain; Transmembrane; Multigene family;

KW Alternative splicing.

SQ SEQUENCE 296 AA;

Query Match 9.7%; Score 152; DB 1; Length 296;

Best Local Similarity 22.0%; Pred. No. 4.8e-05;

Matches 71; Conservative 49; Mismatches 116; Indels 86; Gaps 16;

Qy 21 YSPSLKSVQDQAYKAP WVEGKVQGLVPAGGSSSN STREP PAS GRVALV KVLD 73

I I I : I : : I : : : I I I I : III I =

Db 4 YSPDMSEVAAERSSSPSTQLSADPSLDGLPAAEDMPEPQTEDGRTPGLVGLAVPCCACLE 63



Qy 74 KWPLRSGGLQREQV ISVGSCVPLERNQRYIF FLEPT -E 110 

II I I I:: :|: I: :::| : II : 

Db 64 AERLR-GCLNSEKICIVPILACLVSLCLCI AGLKWVFVDKI FEYDSPTHLDPGGLGQ 119 

Qy 111 QPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQV GEKQSLKCEA 167 

I:: II : : :: | | : I I |:| : I 

Db 120 DPII SLDATAAS AVWVSSEAYTSPVSRAQSESEVQVTVQGDKAWSFEP 168 

Qy 168 AAGNPQPSYRWF KDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYV 218 

: I II II I : :: I : :: I : :: I : 

Db 169 SAA-PTPKNRIFAFSFLPSTAPSFPSPTRNPEVR TPKSATQPQTTETNLQTAPK — 221 

Qy 219 CEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLS- 277

I M I : I I III I : : I I I I I I : : : : : I 
Db 222 LSTSTSTTGTS HLVKCAEKEKTFCVNGGECFMVKDLSNPSR 262 

Qy 278 --CKCPVGYTGDRCQQFAMVNF 297

MM M I M I I : I : I 
Db 263 YLCKCPNEFTGDRCQNYVMASF 284 



RESULT 11 
EGF_MOUSE 

ID EGF_MOUSE STANDARD; PRT; 1217 AA. 

AC P01132; 

DT 21-JUL-1986 (Rel. 01, Created) 

DT 21-JUL-1986 (Rel. 01, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Pro-epidermal growth factor precursor (EGF) [Contains: Epidermal

DE growth factor].

GN EGF.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=83223630; PubMed=6602382;

RA Scott J., Urdea M., Quiroga M., Sanchez-Pescador R., Fong N.M.,

RA Selby M., Rutter W.J., Bell G.I.; 

RT "Structure of a mouse submaxillary messenger RNA encoding epidermal 

RT growth factor and seven related proteins."; 

RL Science 221:236-240(1983). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=83219309; PubMed=6304537;

RA Gray A., Dull T.J., Ullrich A.;

RT "Nucleotide sequence of epidermal growth factor cDNA predicts a

RT 128,000-molecular weight protein precursor.";

RL Nature 303:722-725(1983). 

RN [3] 

RP SEQUENCE OF 977-1029. 

RX MEDLINE=73048516; PubMed=4636327;

RA Savage C.R. Jr., Inagami T., Cohen S.;

RT "The primary structure of epidermal growth factor.";

RL J. Biol. Chem. 247:7612-7621(1972).



RN [4] 

RP DISULFIDE BONDS. 

RX MEDLINE=74025498; PubMed=4750422;

RA Savage C.R. Jr., Hash J.H., Cohen S.;

RT "Epidermal growth factor. Location of disulfide bonds.";

RL J. Biol. Chem. 248:7669-7672(1973).

RN [5] 

RP STRUCTURE BY NMR OF 977-1029. 

RX MEDLINE=92118798; PubMed=1731873;

RA Montelione G.T., Wuethrich K., Burgess A.W., Nice E.C., Wagner G.,

RA Gibson K.D., Scheraga H.A.; 

RT "Solution structure of murine epidermal growth factor determined by 

RT NMR spectroscopy and refined by energy minimization with 

RT restraints."; 

RL Biochemistry 31:236-249(1992).

RN [6] 

RP STRUCTURE BY NMR OF 977-1029. 

RX MEDLINE=93075811; PubMed=1445923;

RA Kohda D., Inagaki F.;

RT "Three-dimensional nuclear magnetic resonance structures of mouse 

RT epidermal growth factor in acidic and physiological pH solutions."; 

RL Biochemistry 31:11928-11939(1992). 

RN [7] 

RP STRUCTURE BY NMR OF 980-1024. 

RX MEDLINE=99180407; PubMed=10082370;

RA Barnham K.J., Torres A.M., Alewood D., Alewood P.F., Domagala T.,

RA Nice E.C., Norton R.S.;

RT "Role of the 6-20 disulfide bridge in the structure and activity of 

RT epidermal growth factor."; 

RL Protein Sci. 7:1738-1749(1998). 

CC -!- FUNCTION: The growth factor stimulates the growth of various 

CC epidermal and epithelial tissues in vivo and in vitro and of some 

CC fibroblasts in cell culture. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein.

CC -!- SIMILARITY: Contains 9 EGF-like domains. 

CC -!- CAUTION: Ref.2 sequence differs from that shown in positions 1134 
CC to 1168 due to a frameshift. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; J00380; AAA37539.1; 

DR EMBL; V00741; CAA24115.1; ALT_FRAME. 

DR EMBL; V00741; CAA24116.1; -. 

DR PIR; A94272; EGMSMG. 

DR PDB; 1EGF; 31-JAN-94.

DR PDB; 3EGF; 31-JAN-94.

DR PDB; 1EPG; 31-JAN-94.

DR PDB; 1EPH; 31-JAN-94.

DR PDB; 1EPI; 31-JAN-94.

DR PDB; 1EPJ; 31-JAN-94.

DR PDB; 1A3P; 29-JUL-98. 



DR PDB; 1GK5; 08-AUG-02.

DR MGD; MGI:95290; Egf.

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR001336; EGF_1.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR000033; Ldl_receptor_rep.

DR Pfam; PF00008; EGF; 8.

DR Pfam; PF00058; ldl_recept_b; 7.

DR PRINTS; PR00009; EGFTGF.

DR SMART; SM00179; EGF_CA; 2.

DR SMART; SM00135; LY; 7.

DR PROSITE; PS00010; ASX_HYDROXYL; 3.

DR PROSITE; PS00022; EGF_1; 1.

DR PROSITE; PS01186; EGF_2; 6.

DR PROSITE; PS50026; EGF_3; 5.

DR PROSITE; PS01187; EGF_CA; 3.

KW EGF-like domain; Repeat; Growth factor; Transmembrane; Glycoprotein;

KW Signal; 3D-structure.










FT SIGNAL 1 28 POTENTIAL.

FT CHAIN 29 1217 PRO-EPIDERMAL GROWTH FACTOR.

FT CHAIN 977 1029 EPIDERMAL GROWTH FACTOR.

FT DOMAIN 29 1038 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 1039 1058 POTENTIAL.

FT DOMAIN 1059 1217 CYTOPLASMIC (POTENTIAL).

FT DOMAIN 327 361 — LlJ\£j 1 t INLUNfijiljirj j .

FT DOMAIN 362 402 EGF-LIKE 2, CALCIUM-BINDING (POTENTIAL).

FT DOMAIN 403 443 EGF-LIKE 3.

FT DOMAIN 441 483 EGF-LIKE 4.

FT DOMAIN 747 TOT EGF-LIKE 5.

FT DOMAIN o o o O T EGF-LIKE 6.

FT DOMAIN o n '~i EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).

FT DOMAIN 919 959 EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).

FT DOMAIN 978 1019 EGF-LIKE 9.

FT DISULFID 366 377 BY SIMILARITY.

FT DISULFID 373 386 BY SIMILARITY.

FT DISULFID 388 401 BY SIMILARITY.

FT DISULFID 407 418 BY SIMILARITY.

FT DISULFID 414 427 BY SIMILARITY.

FT DISULFID 429 442 BY SIMILARITY.

FT DISULFID 445 457 BY SIMILARITY.

FT DISULFID 453 467 BY SIMILARITY.

FT DISULFID 469 482 BY SIMILARITY.

FT DISULFID 751 762 BY SIMILARITY.

FT DISULFID 758 771 BY SIMILARITY.

FT DISULFID 773 786 BY SIMILARITY.

FT DISULFID 842 853 BY SIMILARITY.

FT DISULFID 847 862 BY SIMILARITY.

FT DISULFID 864 875 BY SIMILARITY.

FT DISULFID 881 895 BY SIMILARITY.

FT DISULFID 888 904 BY SIMILARITY.

FT DISULFID 906 917 BY SIMILARITY.

FT DISULFID 923 936 BY SIMILARITY.

FT DISULFID 930 945 BY SIMILARITY.

FT DISULFID 947 958 BY SIMILARITY.

FT DISULFID 982 996

FT DISULFID 990 1007

FT DISULFID 1009 1018






FT 




X u ^ *± 


102 9 


MnT RFHTTTRFD FHR T 


D X U Xi U X ^/tXj 


FT 








ArTTVTTY 

1 X V X X X > 




FT 




111 

XXX 


111 

XXX 


IN XjX1NJ\£j]J V '^-Li^>^lN/-vt^ . , . 


\ ^ DrsnPPMT'T AT. \ 


FT 




4 1 n 

*± X u 


'-I J. \J 


M_T TM'K'F'n /(^TPMAP 

IN Xj X IN Ja i_j 1^ V '^-'-''^''■N/t.O . • • 


\ / D/^nPTT'KTT'T AT ^ 


FT 
r 1. 




O X u 


fil ft 
o X u 


IN XiXINiMJi^ \ oXjk-^iN/\L^ , 




FT 




QA A 


Q4 4 




; \ rCji i XAXj J 


FT 
r J. 


PDMFT.TPT 


/ J? u 


/ J? u 


n — ^ Y ^ TM RFF 0\ 




FT 


rnNFT.TPT 


X U ^ 0 


1 (14 fl 

X U rt O 


A — ^ TTJ RFTT 0\ 
/\ ^ O ^ XlN r\Ejr . -ii / . 




FT 


1 U KIN 


_7 0 .3 


Q ft 4 
J? 0 ft 






r J. 


c rp p a "NT "n 


QQ ^ 


Q Q ft 






r 1 


i U KIN 


1 n n n 
X u u u 


1 nn9 
X u uz 






FT 




1 nns 

X VJ U J 


1 onft 

X u u o 






FT TURN 1011 1012

FT STRAND 1013 1014

FT TURN 1015 1018

FT STRAND 1020 1021

SQ SEQUENCE 1217 AA; 133144 MW; A9C7F3D512F82873 CRC64;

Query Match 9.5%; Score 150; DB 1; Length 1217;

Best Local Similarity 23.4%; Pred. No. 0.00037;

Matches 64; Conservative 31; Mismatches 104; Indels 74; Gaps 11;



Qy 65 RVALVKVLD KWPLRSG GLQREQVISVGSCVPLERNQRYI FFLEPTEQPLV 114

I III : I : M I : I I I : : I I :

Db 774 REGFVKAWDGKMCLPQDYPILSGENADLSKE-VTSLSNSTQAE VPDDDGTE 823

Qy 115 FKTAFAPLDTNGKNLKKEVGKILC TDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGN 171

I I : : I I : : I I I : I : I I I I I I I

Db 824 SSTLVAEIMVSGMNYEDDCGPGGCGSHARCVS DGETAECQCLKGFARDGN 873

Qy 172 PQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGE 216

III : I : I : : : |

Db 874 LCSDIDECVLARSDCPSTSSRC INTEGGYVCRCSEGYEGDGISCFDIDECQRGA 927

Qy 217 YVCEAEN ILGKDTVRGRLYVNSVSTTLSSWSGHARK CNETA 257

: I I I I I : : II : I : : I | | | :

Db 928 HNC-AENAACTNTEGGYNCTCAGRPSSPGRSCPDSTAPSLLGEDGHHLDRNSYPGCPSSY 986

Qy 258 KSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQ 290

I I : I II I I : I I : : : I I : II : II I I I

Db 987 DGYCLNGGVCMHIESLDSYTCNCVIGYSGDRCQ 1019
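
Because the match line between each Qy/Db pair did not survive extraction cleanly, identities in an ungapped block can be recounted directly from the two sequence rows. The sketch below does this for the final block of the alignment above (Qy 258-290 against Db 987-1019). The function name is hypothetical, and the comparison counts exact identities only, not conservative substitutions under BLOSUM62.

```python
def count_identities(query_row, db_row):
    """Count aligned columns where both rows carry the same residue.

    Gap characters ('-') never count as a match.
    """
    if len(query_row) != len(db_row):
        raise ValueError("aligned rows must have equal length")
    return sum(1 for q, d in zip(query_row, db_row) if q == d and q != "-")


qy = "KSYCVNGGVCYYIEGINQLSCKCPVGYTGDRCQ"  # Qy 258-290
db = "DGYCLNGGVCMHIESLDSYTCNCVIGYSGDRCQ"  # Db 987-1019
identical = count_identities(qy, db)
print(identical, "identical of", len(qy), "columns")
```

This block covers the EGF-like region, where the conserved cysteines line up in both sequences.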



RESULT 12 
DSCA_HUMAN 

ID DSCA_HUMAN STANDARD; PRT; 2012 AA.

AC O60469; O60468;

DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Down syndrome cell adhesion molecule precursor (CHD2).

GN DSCAM.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;



RN [1] 

RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING. 

RC TISSUE=Brain; 

RX MEDLINE=98087574; PubMed=9426258;

RA Yamakawa K., Huot Y.-K., Haendelt M.A., Hubert R., Chen X.-N.,

RA Lyons G.E., Korenberg J.R.; 

RT "DSCAM: a novel member of the immunoglobulin superfamily maps in a 

RT Down syndrome region and is involved in the development of the 

RT nervous system."; 

RL Hum. Mol. Genet. 7:227-237(1998). 

RN [2] 

RP SEQUENCE FROM N.A., AND FUNCTION.

RX MEDLINE=20384934; PubMed=10925149;

RA Agarwala K.L., Nakamura S., Tsutsumi Y., Yamakawa K.;

RT "Down syndrome cell adhesion molecule DSCAM mediates homophilic 

RT intercellular adhesion."; 

RL Brain Res. Mol. Brain Res. 79:118-126(2000).

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20289799; PubMed=10830953;

RA Hattori M., Fujiyama A., Taylor T.D., Watanabe H., Yada T.,

RA Park H.-S., Toyoda A., Ishii K., Totoki Y., Choi D.-K., Groner Y.,

RA Soeda E., Ohki M., Takagi T., Sakaki Y., Taudien S., Blechschmidt K.,

RA Polley A., Menzel U., Delabar J., Kumpf K., Lehmann R., Patterson D.,

RA Reichwald K., Rump A., Schillhabel M., Schudy A., Zimmermann W.,

RA Rosenthal A., Kudoh J., Shibuya K., Kawasaki K., Asakawa S.,

RA Shintani A., Sasaki T., Nagamine K., Mitsuyama S., Antonarakis S.E.,

RA Minoshima S., Shimizu N., Nordsiek G., Hornischer K., Brandt P.,

RA Scharfe M., Schoen O., Desario A., Reichelt J., Kauer G., Bloecker H.,

RA Ramser J., Beck A., Klages S., Hennig S., Riesselmann L., Dagand E.,

RA Wehrmeyer S., Borzym K., Gardiner K., Nizetic D., Francis F.,

RA Lehrach H., Reinhardt R., Yaspo M.-L.;

RT "The DNA sequence of human chromosome 21."; 

RL Nature 405:311-319(2000). 

CC -!- FUNCTION: CELL ADHESION MOLECULE THAT CAN MEDIATE CATION-

CC INDEPENDENT HOMOPHILIC BINDING ACTIVITY. COULD BE INVOLVED IN

CC NERVOUS SYSTEM DEVELOPMENT. 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN (PROBABLE). THE 
CC SHORT ISOFORM MAY BE SECRETED. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=2;

CC Name=Long; Synonyms=CHD2-52;

CC IsoId=O60469-1; Sequence=Displayed;

CC Name=Short; Synonyms=CHD2-42;

CC IsoId=O60469-2; Sequence=VSP_002502, VSP_002503;

CC -!- TISSUE SPECIFICITY: Primarily expressed in brain.

CC -!- SIMILARITY: Contains 10 immunoglobulin-like C2-type domains.

CC -!- SIMILARITY: Contains 6 fibronectin type III domains.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 



DR EMBL; AF023450; AAC17967.1 

DR EMBL; AF023449; AAC17966.1

DR EMBL; AF217525; AAF27525.1

DR EMBL; AL163283; CAB90464.1

DR EMBL; AL163282; CAB90436.1

DR EMBL; AL163281; CAB90444.1

DR Genew; HGNC:3039; DSCAM. 

DR MIM; 602523; -. 

DR GO; GO:0005887; C:integral to plasma membrane; TAS.

DR GO; GO:0005624; C:membrane fraction; TAS.

DR GO; GO:0007155; P:cell adhesion; TAS.

DR GO; GO:0007399; P:neurogenesis; TAS.

DR InterPro; IPR008957; FN_III-like. 

DR InterPro; IPR003961; FN_III. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2.

DR Pfam; PF00041; fn3; 6. 

DR Pfam; PF00047; ig; 9. 

DR SMART; SM00060; FN3; 6. 

DR SMART; SM00408; IGc2; 7. 

DR PROSITE; PS50835; IG_LIKE; 9. 

KW Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat; 

KW Transmembrane; Alternative splicing. 
FT IGQVTSYICLHTLEWTFC (IN REF. 1).

SQ SEQUENCE 2012 AA; 222259 MW; 0E33CFB781A08334 CRC64;



Query Match 9.2%; Score 145.5; DB 1; Length 2012; 

Best Local Similarity 24.1%; Pred. No. 0.0015; 

Matches 77; Conservative 34; Mismatches 111; Indels 97; Gaps 15; 

Qy 5 PAPGFSMLLFGVSLAC--YSPSLK-SVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREPP 61

I III: : I : I I I II II : III

Db 87 PPSSFSTLIHDNTYYCTAENPSGKIRSQDVHIKAVL REP- 125

Qy 62 ASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFL--EPTEQPLVFKTAF 119

|:| I: :| ::| l: : || : | II : I

Db 126 YTVRVEDQKTMRGN VAVFKCIIPSSVEAYITWSWEKDTVSLVSGSRF 173

Qy 120 APLDTNG- -KNLKKEVGKILCTDCATRPK -LKKMK 151

I I : : : I I : I M : I

Db 174 LITSTGALYIKDVQNEDG-LYNYRCITRHRYTGETRQSNSARLFVSDPANSAPSILDGFD 232

Qy 152 SQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN-SRLQFNKVK 210

: I : : I I : I I : I : I I I I t I I I I : I : I : :

Db 233 HRKAMAGQRVELPCK-ALGHPEPDYRWLKDNMPLELS GRFQKTVTGLLIENIR 284

Qy 211 VEDAGEYVCEAENILGKDTVRGRLYVNS-VSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269

I : I I I I I I I I II I I I : I : I I I : I

Db 285 PSDSGSYVCEVSNRYGTAKVIGRLYVKQPLKATIS PRKVKSSVGS 329

Qy 270 IEGINQLSCKCPVGYTGDR 288

I : I II II:

Db 330 QVSLSCSVTGTEDQ 343



RESULT 13 
LAMP_RAT



ID LAMP_RAT STANDARD; PRT; 338 AA.

AC Q62813;

DT 01-NOV-1997 (Rel. 35, Created)

DT 01-NOV-1997 (Rel. 35, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Limbic system-associated membrane protein precursor (LSAMP).

GN LSAMP OR LANP.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 29-49. 

RC TISSUE=Hippocampus; 

RX MEDLINE=95374785; PubMed=7646886;

RA Pimenta A.F., Zhukareva V., Barbe M.F., Reinoso B.S., Grimley C.,

RA Henzel W., Fischer I., Levitt P.;

RT "The limbic system-associated membrane protein is an Ig superfamily 

RT member that mediates selective neuronal growth and axon targeting."; 

RL Neuron 15:287-297(1995). 

CC -!- FUNCTION: MEDIATES SELECTIVE NEURONAL GROWTH AND AXON TARGETING. 
CC CONTRIBUTES TO THE GUIDANCE OF DEVELOPING AXONS AND REMODELING OF 

CC MATURE CIRCUITS IN THE LIMBIC SYSTEM. ESSENTIAL FOR NORMAL GROWTH 

CC OF THE HIPPOCAMPAL MOSSY FIBER PROJECTION.

CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor. 
CC -!- TISSUE SPECIFICITY: Expressed mostly by neurons comprising limbic-

CC associated cortical and subcortical regions that function in

CC cognition, emotion, memory, and learning.

CC -!- DEVELOPMENTAL STAGE: FIRST DETECTED AT E15-16, AT STAGE E20 IT IS

CC DETECTED IN PRESUMPTIVE CORTEX, MEDIAL LIMBIC AREAS OF THE 

CC THALAMUS AND HYPOTHALAMUS. IN THE ADULT, IT IS FOUND IN 

CC HYPOTHALAMUS, PERIRHINAL CORTEX, AMYGDALA AND MEDIAL THALAMIC 

CC REGION.

CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON 
CC family. 

CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC . 

DR EMBL; U31554; AAA86120.1; 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00047; ig; 3. 

DR SMART; SM00408; IGc2; 2. 

DR PROSITE; PS50835; IG_LIKE; 3. 

KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor; 

KW Repeat; Signal; Lipoprotein. 

FT SIGNAL 1 28 

FT CHAIN 29 315 LIMBIC SYSTEM-ASSOCIATED MEMBRANE

FT PROTEIN.

FT PROPEP 316 338 REMOVED IN MATURE FORM (POTENTIAL).



FT DOMAIN 29 122 IG-LIKE C2-TYPE 1.
FT DOMAIN 132 214 IG-LIKE C2-TYPE 2.
FT DOMAIN 219 304 IG-LIKE C2-TYPE 3.
FT DISULFID 53 111 POTENTIAL.
FT DISULFID 153 197 POTENTIAL.
FT DISULFID 239 290 POTENTIAL.
FT CARBOHYD 40 40 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 66 66 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 136 136 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 148 148 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 279 279 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 287 287 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 300 300 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 315 315 N-LINKED (GLCNAC...) (POTENTIAL).
FT LIPID 315 315 GPI-anchor amidated asparagine
FT (Potential).
SQ SEQUENCE 338 AA; 37324 MW; 0B76AFDD68A39BB6 CRC64;



Query Match 9.2%; Score 145; DB 1; Length 338; 

Best Local Similarity 23.9%; Pred. No. 0.0002;

Matches 55; Conservative 31; Mismatches 92; Indels 52; Gaps 9; 

Qy 53 SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQP 112 

I : I I I : I : I I : : : :: | | | | : : | 

Db 112 SVQTQHEPKTSQVYLIVQV PPKI SNISSDVTVNEGSNVTL VCMANGRPEP 161 

Qy 113 LVFKTAFAPL DTNGKNLKKEVGKILCTDCAT RPKL 147 

: : II : : I I I : : I | : 

Db 162 VITWRHLTPLGREFEGEEEYLEILGITREQSGKYECKAANEVSSADVKQVKVTVNYPPTI 221 

Qy 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFN 207

: I I I : II I I I I : I I I : I : : I : | : : | | | : | | 

Db 222 TESKSNEATTGRQASLKCEASA-VPAPDFEWYRDDTRINSANGLEIKSTEGQ--SSLTVT 278

Qy 208 KVKVEDAGEYVCEAENILG KDTVRGRLYVN-SVSTTLSSW 246 

I I I I I I I I I : I I I : I I : I : I 
Db 279 NVTEEHYGNYTCVAANKLGVTNASLVLFRPGSVRG INGSISLAVPLW 325 



RESULT 14 
LAMP_HUMAN 

ID LAMP_HUMAN STANDARD; PRT; 338 AA.

AC Q13449; 

DT 01-NOV-1997 (Rel. 35, Created)

DT 01-NOV-1997 (Rel. 35, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Limbic system-associated membrane protein precursor (LSAMP).

GN LSAMP OR LAMP. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=96235133; PubMed=8666243;

RA Pimenta A.F., Fischer I., Levitt P.; 

RT "cDNA cloning and structural analysis of the human limbic-system-
RT associated membrane protein (LAMP).";
RL Gene 170:189-195(1996).
CC -!- FUNCTION: MEDIATES SELECTIVE NEURONAL GROWTH AND AXON TARGETING.
CC CONTRIBUTES TO THE GUIDANCE OF DEVELOPING AXONS AND REMODELING OF
CC MATURE CIRCUITS IN THE LIMBIC SYSTEM. ESSENTIAL FOR NORMAL GROWTH
CC OF THE HYPPOCAMPAL MOSSY FIBER PROJECTION (BY SIMILARITY).
CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor.
CC -!- TISSUE SPECIFICITY: Expressed on limbic neurons and fiber tracts
CC as well as in single layers of the superior colliculus, spinal
CC chord and cerebellum.
CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON
CC family.
CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U41901; AAC50569.1;
DR PIR; JC4776; JC4776.
DR Genew; HGNC:6705; LSAMP.
DR MIM; 603241;
DR GO; GO:0007399; P:neurogenesis; TAS.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00047; ig; 3.
DR SMART; SM00408; IGc2; 2.
DR PROSITE; PS50835; IG_LIKE; 3.
KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor;
KW Repeat; Signal; Lipoprotein.
FT SIGNAL 1 28 POTENTIAL.
FT CHAIN 29 315 LIMBIC SYSTEM-ASSOCIATED MEMBRANE
FT PROTEIN.
FT PROPEP 316 338 REMOVED IN MATURE FORM (POTENTIAL).
FT DOMAIN 29 122 IG-LIKE C2-TYPE 1.
FT DOMAIN 132 214 IG-LIKE C2-TYPE 2.
FT DOMAIN 219 304 IG-LIKE C2-TYPE 3.
FT DISULFID 53 111 POTENTIAL.
FT DISULFID 153 197 POTENTIAL.
FT DISULFID 239 290 POTENTIAL.
FT CARBOHYD 40 40 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 66 66 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 136 136 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 148 148 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 279 279 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 287 287 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 300 300 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 315 315 N-LINKED (GLCNAC...) (POTENTIAL).
FT LIPID 315 315 GPI-anchor amidated asparagine
FT (Potential).
SQ SEQUENCE 338 AA; 37308 MW; 03455F286DF5D92F CRC64;



Query Match 9.1%; Score 144; DB 1; Length 338;



Best Local Similarity 25.0%; Pred. No. 0.00025; 

Matches 56; Conservative 34; Mismatches 94; Indels 40; Gaps 10; 

Qy 53 NSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGS CVPLERNQRYIFF-- 105

: II I :|:| I : : : :: 11 |: I : I :

Db 112

Qy 106 -DTNGKNLKKEVGKILCTDCAT RPKLKKMKSQ 153

: : I I I :: I I : : I I

Db

Qy 154

I I I I I I : I I I : I : : I : | : : | | | : | |

Db 228

Qy 214

I I I II :|ll :| |:|

Db 285


RESULT 15 
CEPU_CHICK



ID CEPU_CHICK STANDARD; PRT; 353 AA. 

AC Q90773; 

DT 01-NOV-1997 (Rel. 35, Created)

DT 01-NOV-1997 (Rel. 35, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE CEPU-1 protein precursor. 

OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;

OC Gallus.

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).

RC TISSUE=Brain;

RX MEDLINE=96370549; PubMed=8774445;

RA Spaltmann F., Bruemmendorf T.; 

RT "CEPU-1, a novel immunoglobulin superfamily molecule, is expressed by 

RT developing cerebellar Purkinje cells."; 

RL J. Neurosci. 16:1770-1779(1996).

CC -!- FUNCTION: It may be a cellular address molecule specific to

CC Purkinje cells. It may represent a receptor or a subunit of a
CC receptor complex. 

CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=2; 

CC Name=1; Synonyms=Minor;

CC IsoId=Q90773-1; Sequence=Displayed;

CC Name=2; Synonyms=Major;

CC IsoId=Q90773-2; Sequence=VSP_002607;

CC -!- TISSUE SPECIFICITY: Found on the dendrites, somata and axons of 
CC developing Purkinje cells. Undetectable on other neurons like 

CC Golgi or granule cells. 

CC -!- DEVELOPMENTAL STAGE: Expressed by developing cerebellar Purkinje

CC cells. Expression coincides with the growth of the dendritic tree, 

CC after Purkinje cells have finished their migration from the 



CC ventricular zone (from E15 until E21). Expressed in the adult.
CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON
CC family.
CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



DR EMBL; Z72497; CAA96578.1; -.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00047; ig; 3.
DR SMART; SM00408; IGc2; 2.
DR PROSITE; PS50835; IG_LIKE; 3.
KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor;
KW Repeat; Signal; Alternative splicing; Lipoprotein.
FT SIGNAL 1 28 POTENTIAL.
FT CHAIN 29 330 CEPU-1 PROTEIN.
FT PROPEP 331 353 REMOVED IN MATURE FORM (POTENTIAL).
FT DOMAIN 37 124 IG-LIKE C2-TYPE 1.
FT DOMAIN 134 216 IG-LIKE C2-TYPE 2.
FT DOMAIN 220 314 IG-LIKE C2-TYPE 3.
FT DISULFID 55 113 POTENTIAL.
FT DISULFID 155 199 POTENTIAL.
FT DISULFID 241 293 POTENTIAL.
FT CARBOHYD 42 42 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 68 68 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 150 150 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 282 282 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 290 290 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 303 303 N-LINKED (GLCNAC...) (POTENTIAL).
FT LIPID 330 330 GPI-anchor amidated serine (Potential).
FT VARSPLIC 310 320 Missing (in isoform 2).
FT /FTId=VSP_002607.
SQ SEQUENCE 353 AA; 38736 MW; 2550C48591EBBBA6 CRC64;



Query Match 9.0%; Score 141.5; DB 1; Length 353;

Best Local Similarity 36.5%; Pred. No. 0.00041;

Matches 38; Conservative 11; Mismatches 52; Indels 3; Gaps 3;

Qy 145 PKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRL 204

I • II 11:11111:11 ::|:|| I I : :| | |||

Db 221 PYISDAKSTGVPVGQKGILMCEASA-VPSADFQWYKDDKRLAEGQK-GLKVENKAFFSRL 278

Qy 205 QFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSG 248

I I :| I I I I I II II :||: I I

Db 279 TFFNVSEQDYGNYTCVASNQLGNTNASMILY-EETTTALTPWKG 321



Search completed: August 17, 2004, 14:11:18 
Job time : 9.5414 secs



